Wednesday, 05 February 2014

Dart vs. ClojureScript: two weeks later

A couple of weeks ago, I wrote about my first impressions of the Dart programming language in preparation for GDG Houston’s Dart Flight School event coming up on the 22nd of February. Since then, I have finished the new code lab, the Dart tutorials, and the AngularDart tutorial. For comparison’s sake, I did all but the AngularDart tutorial in both Dart and ClojureScript to get a feel for the differences between the two languages. I have published my ClojureScript implementation of the Dart tutorials on GitHub.

After working my way through all of this code, what’s my take on Dart and ClojureScript? I’ll start with addressing errors from my previous post and then compare Dart and ClojureScript in the following areas:

  • Standard libraries
  • Ecosystem
  • Tooling
  • Debugging
  • Documentation
  • Outside the browser
  • Integrating with Polymer
  • Integrating with Angular
  • Asynchronous programming support

Errata from ‘First impressions’

Before I get to the comparisons, I would like to correct some things I got wrong last time.

Static typing

Dart supports a mix of dynamic and static typing. You can program in Dart without ever declaring a type. However, without the static types, static analysis tools available for Dart will be less effective. Nonetheless, it is a choice you get to make. For example, take the following program:

void main() {
  var noType = "foo";
  noType = 2;

  String withType = "foo";
  withType = 2;

  print("$noType $withType");
}

An IDE or the dartanalyzer program will flag line 6 above and give you a warning that A value of type 'int' cannot be assigned to a variable of type 'String'. Nonetheless, the program will run just fine and output 2 2. However, running the program with ‘checked mode’ enabled (either as a runtime option to the Dart VM or a compile-time option when compiling to JavaScript) will produce an exception at line 6 with a message akin to type 'int' is not a subtype of type 'String' of 'withType'.

There is one place where Dart’s type system does irk me: only false is false and only true is true. In the following program, all of the print statements will print that the value is false (in unchecked mode):

bool isTrue(var v) {
  if (v) {
    return true;
  } else {
    return false;
  }
}

void main() {
  print("0 is ${isTrue(0)}");
  print("1 is ${isTrue(1)}");
  print("empty string is ${isTrue("")}");
  print("non-empty string is ${isTrue("foo")}");
  print("null is ${isTrue(null)}");
}

In this case, the static analyser does not find any problems with the program, but running it in checked mode will produce a type error since the expression for the if condition must be a boolean type.

I have grown used to things like null pointers or zero being false and non-null pointers and non-zero integers being true. I find needing to explicitly make an equality check annoying.

Clojure has core.typed, a library to add gradual typing to Clojure programs. However, using it is not nearly as seamless as is choosing to use static typing in Dart.

Serialisation

This is one area where I got a lot of feedback last time. First, a few points:

  • It is idiomatic Dart to serialise to JSON.
  • Dart’s JSON library can automatically handle num, String, bool, and Null types. List and Map objects can also be automatically serialised subject to a few constraints.
  • Dart also has a serialisation library that serialises objects reflectively or non-reflectively. It is fairly powerful and highly customisable.

What’s the fallout from the above with respect to ClojureScript? I have a couple of thoughts:

  1. ClojureScript’s extensible data notation (EDN) is richer than JSON, making it better able to represent complex data structures while retaining semantic information. For example, is a list of items intended to have constant-time random access (a vector) or contain unique elements (a set)? Additionally, it is extensible and allows you to add support for application-specific types.
  2. ClojureScript’s data-centric approach (using the built-in data structures rather than creating new types) makes serialisation very easy. If you follow the same approach in Dart, you can enjoy many of the same benefits. However, as soon as you introduce new types in either language, the situation becomes more difficult.

In conclusion, it seems like if you stick with built-in types, both languages have a comparable serialisation story. Nonetheless, I think that the idiomatic Clojure approach to data combined with the richness of EDN gives it an edge over Dart.

Dart vs. ClojureScript

Now that I have spent some more time with both languages, I can make more informed and helpful comparisons between the two. One thing to keep in mind is that these experiences come from going through the Dart tutorials, so they may not play to ClojureScript’s strengths.

Standard library

Dart takes a ‘batteries-included’ approach to its standard library, making it easy to write both web and command-line applications without depending on external libraries.

In comparison, ClojureScript is very much a hosted language. While it has superior support for functional programming and manipulating data structures, when the time comes to move on from solving koans to writing a real application, you discover you have to learn your target platform and how to interoperate with it.

I think of it this way:

Developer: Hi, I want to write a web application.

ClojureScript: Great!

Developer: Um… how do I get a reference to a DOM element?

ClojureScript: That depends.

Developer: On what?

ClojureScript: Well, are you going to use JavaScript directly, use Google’s Closure library, or try a ClojureScript library like Domina or Enfocus? There are also radical alternatives to manipulating the DOM like Pedestal and Om.

Developer: Uh… I don’t know.

Developer spends the next half-day evaluating the ClojureScript options.

Some days later:

Developer: Well, now I need to do something different. I need to use HTML5 IndexedDB.

ClojureScript: Great!

Developer: Is there a nice library for that?

ClojureScript: Sorry, you’ll need to stick to JavaScript or Google Closure. I hope you love callbacks and have brushed up on your interop skills.

Developer groans.

Even later:

Developer: Now I’d like to write a command line utility. I love Clojure, but its overhead is just too big. Can I use ClojureScript?

ClojureScript: Absolutely!

Developer: Great. How do I get started?

ClojureScript: Well, all you need to do is learn Node.js. You’ll find your interop and callback management skills handy.

Developer: I don’t suppose there are any ClojureScript libraries that will make this much easier?

ClojureScript: Nope, you’re on the wild frontier. There are some nice Node modules, though.

Developer considers using Python instead.

Ecosystem

Of course, there is a world beyond the standard library. I can’t account for the quality of libraries, but I can comment on the quantity:

Repository                                        Number
Clojars (Clojure)
  libraries with a dependency on ClojureScript       188
  total number of libraries                        8,270
CPAN (Perl)                                      129,130
Maven Central (Java)                              70,518
NPM (JavaScript/Node.js)                          57,443
Pub (Dart)                                           690
PyPI (Python)                                     39,573
RubyGems.org (Ruby)                               69,863

Both ClojureScript and Dart have far fewer libraries than other, more established, languages. It seems that Dart does have more native libraries available than ClojureScript, and both can take advantage of JavaScript libraries (like those from NPM). It’s hard to tell which language has the true edge in terms of ecosystem.

Tooling

Dart ships an all-in-one package, Dart Editor, that includes:

  • The Dart SDK,
  • A build of Eclipse specialised for Dart (the Dart Editor), and
  • Dartium, a special build of Chrome that includes the Dart VM.

Additionally, there is Dart support as plug-ins for:

  • IntelliJ IDEA and WebStorm
  • Eclipse
  • Emacs
  • Sublime Text 2
  • Vim

I tried the IntelliJ IDEA plug-in, and it seems to be largely on par with Dart Editor, including features like static analysis, code completion, refactoring, and debugging. I also tried the Vim plug-in, but all it does is syntax highlighting.

I believe the Eclipse plug-in is the same as what is bundled with Dart Editor. I cannot speak for the Emacs or Sublime Text 2 support for Dart.

All in all, the tooling story for Dart is pretty solid. For a beginner, there is one download that contains everything to get started. For experienced developers, there is a good chance there is some level of Dart support in their preferred tools.

ClojureScript shares much of the same tooling story as Clojure. I’m not sure what the state of getting started on Clojure is these days, but it seems like Light Table is quickly becoming a popular recommendation.

As an experienced Clojure developer with an established Clojure working toolset, I still find working with ClojureScript more difficult than it ought to be. vim-fireplace supports ClojureScript, but I could never get a configuration that gave me an entirely satisfactory REPL experience. Even when I did manage to connect to the browser, it didn’t seem like my changes were having an effect. Also, features like documentation lookup no longer worked. I’ll accept that this may all be my own fault, but in the end I was back in an edit/compile/reload loop that at times seemed painfully slow (up to about 30 seconds for a recompile).

I have used Light Table with the ClojureScript and the Om tutorials. Undoubtedly, having the instant feedback from using the REPL makes development a much more efficient and enjoyable experience.

Debugging

Although this falls under tooling, I thought I’d draw special attention to debugging. As I mentioned earlier, you can use the debugger in Dart Editor, Eclipse, IDEA, or WebStorm together with Dartium and get a pretty good experience. Dart also prepares source maps when compiling to JavaScript, easing debugging in the browser.

One common complaint about Clojure is its poor error messages. I have never felt that was the case, but I came to Clojure with a lot of JVM experience. I think that ClojureScript definitely lends credence to the notion. It’s possible to enable source map support for ClojureScript, and it helps. However, that also significantly slows down compilation. Given how difficult it is to figure out what actually went wrong (often just a mistyped name) from a JavaScript stack trace, I began to really appreciate the static analysis support for Dart.

Documentation

Dart has excellent documentation. There are many tutorials, articles, code labs, and examples all on-line and accessible from Dart’s home page. Many of the tutorials also don’t presume a lot of specific development experience. As an example, this is very helpful to an experienced back-end developer who is just getting started with developing web applications for the browser.

There are good resources for ClojureScript, but they are spread about the web, primarily in different people’s blog posts. The wiki on ClojureScript’s GitHub has some good links, but I found most of my resources through web searches. Additionally, many resources presume that you’re already familiar with the underlying platform, making it just a bit harder to understand what’s going on and how to get started.

Outside the browser

One of the selling points of Dart is that in addition to being able to create browser-based web applications, you can write the server in the same language. This is great; you can reuse some of the same code seamlessly in both the client and the server. Additionally, it is possible to write command line utilities and even interact with native libraries written in languages like C or C++.

This is also possible with ClojureScript. While it is possible to write ClojureScript servers that run atop Node.js, it is much more common to write the server side in Clojure. As with Dart, the primary benefit of this arrangement is the ability to share code between the server and the client. Additionally, the fact that both ends can easily speak EDN to each other helps. The trickiest part of this combination is that there are subtle differences between Clojure and ClojureScript, and you have to be careful to keep your cross-language libraries within the intersection of the two.

Integrating with Polymer

Polymer is a library built by Google on top of Web Components designed to make it easy to create and reuse bits of functionality in web pages using custom elements. It is largely JavaScript-based, but there is a port of it to Dart, Polymer.dart. I don’t have any previous experience with Polymer, but I got a nice taste of it working through the tutorials.

Working with Polymer.dart was a relatively easy experience. It, along with Polymer itself, is still in a pre-release state. It’s not quite feature-complete compared to Polymer itself, but seemed pretty solid as a whole. I felt the trickiest part of using Polymer.dart was ensuring that the two-way data binding on nested data structures worked well.

There is no equivalent library for Polymer in ClojureScript, so it’s necessary to use ClojureScript’s interop with Polymer. As a result, you can do it, but getting the data binding to work with ClojureScript is an absolute pain. A good library would go far in making working with Polymer more palatable.

Integrating with Angular

AngularJS is an MVC framework for building large web applications, and AngularDart is said to be the future of the framework. AngularDart is not a strict 1:1 port of AngularJS. It works a bit differently and has been said to be ‘Angular reimagined’.

This is my first exposure to Angular of any flavour, and my impression is that it is a fairly neat framework. I enjoyed working my way through the AngularDart tutorial, but it is clear that it is still a pre-1.0 product. It’s not so much that the library is buggy, but the developer documentation is lacking compared to other parts of the Dart ecosystem.

I have not tried much Angular with ClojureScript; I simply haven’t had the time. There are multiple efforts to make Angular and ClojureScript work better together. Given the popularity of AngularJS, I wouldn’t be surprised if a good AngularCLJS library comes about.

In conclusion, it’s still early for ports of Angular to other languages. However, given that Google is behind AngularDart and pushing it forward, I expect it to mature much more quickly.

Asynchronous programming

Dart has a couple of key features for facilitating asynchronous programming:

  1. Futures, a way of performing work asynchronously; and
  2. Streams, a way of asynchronously managing series of events.

Dart’s futures are quite similar to Clojure’s, though a bit richer. For example, there is built-in support for chaining futures and for propagating errors through a chain of futures. Dart’s streams provide a way of acting on a series of events, such as getting data from a socket or reacting to UI events. Both of these features help ameliorate the ‘callback hell’ problem that’s associated with JavaScript.

In comparison, ClojureScript has no native support for either one of these mechanisms. However, there is core.async, a powerful Clojure and ClojureScript library for asynchronous programming. With it, it is possible to write highly asynchronous code in a fashion that reads as if it were synchronous. This makes the code significantly easier to reason about. David Nolen has written a good introductory article about the power of core.async. The main downside to core.async I have run into is that it makes debugging more difficult due to the immense transformation of the code at compile time.

In the end, while I think Dart’s approach to handling asynchronous programming is fairly decent, it doesn’t have the power of core.async.

Final thoughts

Dart was designed to be a language that is easy to learn and can scale to large projects, and I think it has accomplished that goal. If someone with a background in a language like Java or C++ asked me about a language for developing web applications, I would definitely recommend that they consider Dart. With Dart, as with ClojureScript, it is possible to write both the client and the server in the same language reusing the same code. In fact, it’s probably easier in Dart than in a hybrid Clojure/ClojureScript application.

Does this mean I think Dart is better than ClojureScript? In a word, no. I would still recommend ClojureScript to Lisp aficionados and adventurous programmers. Most importantly, I believe ClojureScript’s Lisp roots make it a playground for innovation. I do not think something like core.async’s go macro is possible in a language like Dart. With a working browser REPL, ClojureScript should have the same highly-interactive development experience that Clojure provides, and that makes programming a much more enjoyable and productive experience.

In the end, both Dart and ClojureScript are great languages. Dart is probably the more ‘practical’ of the two, and certainly the easier to pick up. However, ClojureScript is more powerful and, in my opinion, more fun.

Friday, 24 January 2014

How Clojure works: more on namespace metadata

In the post How Clojure works: namespace metadata, I commented on how the metadata seemed to be missing from the following macro-expansion of ns:

(ns greeter.hello
  "A simple namespace, worth decompiling."
  {:author "Daniel Solano Gómez"})

; macro-expands once to (with some cleanup):
(do
  (in-ns 'greeter.hello)
  (with-loading-context (refer 'clojure.core))
  (if (.equals 'greeter.hello 'clojure.core)
    nil
    (do
      (dosync 
        (commute @#'*loaded-libs* 
                 conj 
                 'greeter.hello)) 
      nil)))

*print-meta*

Stuart Sierra pointed out in a comment that if we set *print-meta* to true, the metadata actually shows up three times in the macro-expansion:

(do
  (in-ns (quote ^{:author "Daniel Solano Gómez",
                  :doc "A simple namespace, worth decompiling."}
                greeter.hello))
  (with-loading-context (refer 'clojure.core))
  (if (.equals (quote ^{:author "Daniel Solano Gómez",
                        :doc "A simple namespace, worth decompiling."}
                      greeter.hello)
               'clojure.core)
    nil
    (do
      (dosync (commute @#'*loaded-libs*
                       conj
                       (quote ^{:author "Daniel Solano Gómez",
                                :doc "A simple namespace, worth decompiling."}
                              greeter.hello)))
      nil)))

A couple of observations:

  1. While it was obvious that Clojure wasn't losing the metadata, now we can actually see how it gets processed.

  2. Even though the metadata is expanded three separate times, it only shows up twice in the compiled result. Apparently, when compiling a particular class, the compiler keeps track of which symbols are being used and deduplicates them.

This last point got me thinking: how sensitive is the compiler to the symbols and metadata it encounters?

Modifying the macro-expansion

To find out how sensitive the compiler is to symbols and their metadata, we can replace the ns form above with its macro-expansion and modify it just slightly:

(do
  (in-ns (quote ^{:author "Daniel Solano Gómez",
                  :doc "A simple namespace, worth decompiling."}
                greeter.hello))
  (clojure.core/with-loading-context (clojure.core/refer 'clojure.core))
  (if (.equals 'greeter.hello 'clojure.core)
    nil        
    (do               
      (dosync (commute @#'clojure.core/*loaded-libs*
                       conj
                       (quote ^{:author "Daniel Solano Gómez",
                                :doc "A simple namespace, worth decompiling."}
                              greeter.hello)))
      nil)))                  

Apart from fully qualifying a few clojure.core names, the only change is in the comparison on line 6, where we have removed the metadata from the greeter.hello symbol. Functionally, this has no effect, as metadata doesn't affect equality. However, does this change the generated code?

Examining the impact

As a matter of fact, it does. We have been careful so that the change has only affected the greeter.hello__init class. Just looking at the class signature, we can see this change made an impact:

package greeter;

import clojure.lang.*;

public class hello__init {
  public static final Var const__0;
  public static final AFn const__1;
  public static final AFn const__2;
  public static final AFn const__3;

  static {}

  public static void load();
  public static void __init0();
}

There is now an additional AFn class constant. When we look at the decompiled __init0 method, we can see exactly what has changed:

public static void __init0() {
  const__0 = (Var)RT.var("clojure.core", "in-ns");
  IObj iobj = (IObj)Symbol.intern(null, "greeter.hello");
  Object[] meta = new Object[4];
  meta[0] = RT.keyword(null, "doc");
  meta[1] = "A simple namespace, worth decompiling.";
  meta[2] = RT.keyword(null, "author");
  meta[3] = "Daniel Solano Gómez";
  IPersistentMap metaMap = (IPersistentMap)RT.map(meta);
  const__1 = (AFn)iobj.withMeta(metaMap);
  const__2 = (AFn)Symbol.intern(null, "greeter.hello");
  const__3 = (AFn)Symbol.intern(null, "clojure.core");
}

These changes include:

  • On lines 5-8, the order of the map metadata was changed. I don't think it's a significant change, but it is a change nonetheless;
  • On line 11, const__2 now holds a version of the symbol greeter.hello without the metadata; and
  • On line 12, const__3 holds the reference to clojure.core, which used to be in const__2.

When we examine the decompiled output of load(), we see that, as expected, the version of the greeter.hello symbol that has the metadata is used for the in-ns call and the version without the metadata is used in the comparison to clojure.core:

public static void load() {
  // (in-ns 'greeter.hello)
  IFn inNs = (IFn)const__0.getRawRoot();
  inNs.invoke(const__1); // version with metadata

  // (with-loading-context (refer 'clojure.core))
  IFn loading4910auto = (IFn)new greeter.hello$loading__4910__auto__();
  loading4910auto.invoke();

  // (if (.equals 'greeter.hello 'clojure.core)
  //   nil
  //   (do
  //     (LockingTransaction/runInTransaction (fn* …))
  //     nil))
  Symbol greeterHello = (Symbol)const__2; // version without metadata
  if (greeterHello.equals(const__3)) {
    return; // load() is void, so there is nothing to return
  } else {
    Callable callable = (Callable)new greeter.hello$fn__17();
    LockingTransaction.runInTransaction(callable);
    return;
  }
}

Closing thoughts

First, a big thanks to Stuart Sierra for the tip about *print-meta*; it has been really helpful.

Second, examining the Compiler source code, it becomes a bit clearer what's going on: as the compiler encounters constants, it stores them in a vector to be emitted later. Additionally, it ensures that constants are not duplicated by using an IdentityHashMap, which relies on identity rather than equality. As such, we can see how the two symbols (with and without metadata) would be considered different.
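To make the identity-versus-equality point concrete, here is a minimal sketch in plain Java. It does not use Clojure's own classes: Sym is a hypothetical stand-in for a symbol whose equality ignores metadata, and countConstantSlots is an invented helper that mimics deduplicating constants with an IdentityHashMap. It only illustrates the idea and is not the compiler's actual code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Clojure compiler.
class ConstantDedupSketch {
    /** Stand-in for a symbol: equality looks at the name only, never the metadata. */
    static final class Sym {
        final String name;
        final Map<String, String> meta; // may be null

        Sym(String name, Map<String, String> meta) {
            this.name = name;
            this.meta = meta;
        }

        /** Returns a new object that carries the given metadata. */
        Sym withMeta(Map<String, String> newMeta) {
            return new Sym(name, newMeta);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Sym && ((Sym) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    /** Registers each constant once per object identity and returns the number of slots used. */
    static int countConstantSlots(List<Sym> constants) {
        Map<Object, Integer> slots = new IdentityHashMap<>();
        for (Sym s : constants) {
            if (!slots.containsKey(s)) {
                slots.put(s, slots.size());
            }
        }
        return slots.size();
    }

    public static void main(String[] args) {
        Sym plain = new Sym("greeter.hello", null);
        Sym annotated = plain.withMeta(Collections.singletonMap("doc", "A simple namespace."));

        // Equal according to equals(), yet two distinct objects.
        System.out.println(plain.equals(annotated)); // true
        // The same annotated object twice plus the plain one: two slots.
        System.out.println(countConstantSlots(Arrays.asList(annotated, annotated, plain))); // 2
    }
}
```

A map keyed on equals() would have collapsed all three entries into one slot, which is why the modified macro-expansion ends up with an extra constant.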

However, what's not entirely clear is how the compiler knows that, in the original macro-expansion, the two symbols with metadata are identical. I spent some time studying the compiler source, but it's somewhat hard to follow. I could probably use a debugger to trace its execution, but that's an exercise for another day.

Monday, 20 January 2014

How Clojure works: namespace metadata

In the first How Clojure works post, we examined how a Clojure namespace bootstraps itself. In particular, we saw how beguiling the following program can be.

(ns greeter.hello)

This program actually ends up creating three classes, including two anonymous function classes, each with a static initializer and a handful of constants.

Although I had promised to look at how a def works, I'd like to first add a bit more to our namespace declaration. Let's add some metadata:

(ns greeter.hello
  "A simple namespace, worth decompiling."
  {:author "Daniel Solano Gómez"})

We have added a namespace docstring as well as an attribute map that will be added to the namespace metadata. What do you think will be the result?

Anticipating the changes

Last time, we saw that ns is a macro that actually does quite a bit. So, let's expand it once (and clean up the result so that it can be read):

(do
  (in-ns 'greeter.hello)
  (with-loading-context (refer 'clojure.core))
  (if (.equals 'greeter.hello 'clojure.core)
    nil
    (do
      (dosync 
        (commute @#'clojure.core/*loaded-libs* 
                 conj 
                 'greeter.hello)) 
      nil)))

Well, that's interesting. It's not any different than what we had before. Where did the metadata go? Is it possible that it's all lost? That's not likely. Further macro-expansion won't help, so let's start decompiling.

Edit: As we see in the follow-up entry, using *print-meta* allows us to see the metadata.

Decompilation overview

When we look at the list of generated classes, we find the same three generated classes as before¹:

  • greeter.hello__init
  • greeter.hello$fn__17
  • greeter.hello$loading__4910__auto__

We are still not seeing anything new, so it's time to break out the decompiler and see what's going on at a deeper level. Let's start with the namespace class, greeter.hello__init.

greeter.hello__init

The class signature of greeter.hello__init hasn't changed:

package greeter;
import clojure.lang.*;

public class hello__init {
  static {}

  public static final Var const__0;
  public static final AFn const__1;
  public static final AFn const__2;

  public static void load();
  public static void __init0();
}

However, if we examine the decompiled code, we find some changes to the __init0 method, so let's take a closer look at that.

__init0()

Examining the new content of the __init0 method, we begin to see what's going on:

public static void __init0() {
  const__0 = (Var)RT.var("clojure.core", "in-ns");
  IObj iobj = (IObj)Symbol.intern(null, "greeter.hello");
  Object[] meta = new Object[4];
  meta[0] = RT.keyword(null, "author");
  meta[1] = "Daniel Solano Gómez";
  meta[2] = RT.keyword(null, "doc");
  meta[3] = "A simple namespace, worth decompiling.";
  IPersistentMap metaMap = (IPersistentMap)RT.map(meta);
  const__1 = (AFn)iobj.withMeta(metaMap);
  const__2 = (AFn)Symbol.intern(null, "clojure.core");
}

As before, const__0 refers to the clojure.core/in-ns var and const__2 refers to the clojure.core symbol. The big difference here is that Clojure is no longer storing the greeter.hello symbol it creates. Instead, it creates that symbol, 'adds'² the metadata to the symbol, and stores the result in const__1.

This explains, to some extent, where the metadata went. It has been preserved by the compiler, but how can the Clojure runtime access the metadata? The greeter.hello__init class doesn't implement IMeta. It seems unlikely that the runtime would scour the class constants of loaded namespace classes looking for metadata.

Clearly, there is more to investigate. Let's take a look at the greeter.hello$loading__4910__auto__ class next.

greeter.hello$loading__4910__auto__

This is the class that implements (with-loading-context (refer 'clojure.core)). It hasn't changed as a result of the new metadata, so let's move on to the last generated class.

greeter.hello$fn__17

This is the anonymous function class that registers the namespace with Clojure. It effectively implements the following Clojure code:

(commute @#'clojure.core/*loaded-libs* 
         conj 
         'greeter.hello)

Decompiling the class, we see that it hasn't changed much. As with greeter.hello__init, the class signature is identical. In this case, the implementation of the static initialiser differs:

static {
  const__0 = (Var)RT.var("clojure.core", "commute");
  const__1 = (Var)RT.var("clojure.core", "deref");
  const__2 = (Var)RT.var("clojure.core", "*loaded-libs*");
  const__3 = (Var)RT.var("clojure.core", "conj");
  IObj iobj = (IObj)Symbol.intern(null, "greeter.hello");
  Object[] meta = new Object[4];
  meta[0] = RT.keyword(null, "author");
  meta[1] = "Daniel Solano Gómez";
  meta[2] = RT.keyword(null, "doc");
  meta[3] = "A simple namespace, worth decompiling.";
  IPersistentMap metaMap = (IPersistentMap)RT.map(meta);
  const__4 = (AFn)iobj.withMeta(metaMap);
}

As before, the first four class constants refer to the vars for clojure.core/commute, clojure.core/deref, clojure.core/*loaded-libs*, and clojure.core/conj. For the fifth class constant, instead of storing the symbol greeter.hello directly, it adds the metadata to the symbol before storing it in the class constant. So what are the consequences of this?

Well, when invoke() is called on this anonymous function, it ensures that clojure.core/*loaded-libs* will contain the symbol that contains the metadata. So, this must be where the namespace metadata comes from, right?

Digging deeper

At this point in my investigation, I was a little bit confused. At first, I thought that the namespace metadata must come from the *loaded-libs* var, but that's just a ref to a sorted set of symbols. However, if I want to get the metadata from a namespace at the REPL, I use (meta (find-ns 'greeter.hello)), and the type of the object returned by find-ns is a Namespace instance, not a Symbol. This got me thinking: what is the purpose of *loaded-libs* and where is the Namespace instance created?

The purpose of *loaded-libs*

*loaded-libs* is a private var declared in core.clj. You can get its content, a sorted set of symbols, via the loaded-libs function. It is used indirectly by require and use to keep track of what namespaces have been loaded. For example, when you use require without :reload or :reload-all, the presence of the namespace name symbol in *loaded-libs* will keep the namespace from being reloaded.

When using :reload-all, Clojure uses an initially-empty, thread-local binding of *loaded-libs*. This allows all dependencies of the desired library to be reloaded once, and the resulting set of loaded namespace name symbols is added to the root binding of *loaded-libs*.

This means that the metadata attached to the symbols in *loaded-libs* is not the metadata we get from the namespace object. For that, we'll have to take a closer look at the metadata attached to the symbol at greeter.hello__init/const__1.

Another look at greeter.hello__init/load

Looking back at greeter.hello__init, the namespace name symbol with metadata is stored in a class constant, const__1. The only place where this constant is used is in the load() method, which is decompiled as follows:

public static void load() {
  // (in-ns 'greeter.hello)
  IFn inNs = (IFn)const__0.getRawRoot();
  inNs.invoke(const__1);

  // (with-loading-context (refer 'clojure.core))
  IFn loading4910auto = (IFn)new greeter.hello$loading__4910__auto__();
  loading4910auto.invoke();

  // (if (.equals 'greeter.hello 'clojure.core)
  //   nil
  //   (do
  //     (LockingTransaction/runInTransaction (fn* …))
  //     nil))
  Symbol greeterHello = (Symbol)const__1;
  if (greeterHello.equals(const__2)) {
    return; // load() is void, so there is nothing to return
  } else {
    Callable callable = (Callable)new greeter.hello$fn__17();
    LockingTransaction.runInTransaction(callable);
    return;
  }
}

As we see here, there are two places where this constant is used:

  1. In lines 3-4, it is used as the argument for in-ns.
  2. In lines 15-16, it is used in a comparison to 'clojure.core.

In the second case, the metadata has no effect, but what of the first?

A closer look at in-ns

in-ns is a bit special. Unlike most of clojure.core, it is not defined in core.clj. Instead, it is constructed within RT.java. Its value is actually an anonymous AFn implementation also defined in RT.java. This implementation is fairly simple, and the noteworthy bit is that the symbol that is passed to in-ns is further passed to the static method clojure.lang.Namespace/findOrCreate.

Class diagram of clojure.lang.Namespace

Namespace contains a static member called namespaces, which is a map³ of namespace name symbols to Namespace object instances. When findOrCreate is called and there is no mapping for the symbol yet, a new Namespace instance is created and inserted into the map.

The Namespace class extends clojure.lang.AReference, which holds metadata and indirectly implements clojure.lang.IMeta. As such, the Namespace constructor uses the metadata from the namespace name symbol as its metadata.

At last, we now know how a namespace gets its metadata. Looking at the implementation of find-ns, we see that it just calls Namespace/find which merely does a lookup in the namespaces map.
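To make the lookup concrete, here is a minimal Java sketch of the find-or-create pattern described above. It is not Clojure's actual source: SketchSymbol and SketchNamespace are names I made up for illustration, and the only behaviour assumed is what this post describes (symbol equality ignores metadata, the registry is a ConcurrentHashMap, and a new namespace copies the metadata of the symbol that created it).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a symbol: equality and hashing ignore metadata.
final class SketchSymbol {
  final String name;
  final Map<String, Object> meta;

  SketchSymbol(String name, Map<String, Object> meta) {
    this.name = name;
    this.meta = meta;
  }

  @Override public boolean equals(Object o) {
    return o instanceof SketchSymbol && ((SketchSymbol) o).name.equals(name);
  }

  @Override public int hashCode() {
    return name.hashCode();
  }
}

// Hypothetical stand-in for a namespace with a static registry.
final class SketchNamespace {
  private static final ConcurrentHashMap<SketchSymbol, SketchNamespace> namespaces =
      new ConcurrentHashMap<>();

  final SketchSymbol name;
  final Map<String, Object> meta;

  private SketchNamespace(SketchSymbol name) {
    this.name = name;
    this.meta = name.meta; // metadata comes from the symbol used at creation
  }

  static SketchNamespace findOrCreate(SketchSymbol name) {
    SketchNamespace existing = namespaces.get(name);
    if (existing != null) return existing;
    SketchNamespace created = new SketchNamespace(name);
    // putIfAbsent returns the previous value if another thread won the race.
    SketchNamespace raced = namespaces.putIfAbsent(name, created);
    return raced == null ? created : raced;
  }

  // Plain lookup, analogous to what find-ns ends up doing.
  static SketchNamespace find(SketchSymbol name) {
    return namespaces.get(name);
  }
}
```

In this sketch, looking up the namespace with a metadata-free symbol still returns the instance created earlier, along with the metadata it was created with.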

Parting thoughts

  1. If the purpose of *loaded-libs* is primarily to keep track of what namespaces have been loaded, does it really need metadata? Metadata doesn't affect the equality of symbols. Arguably, adding metadata to symbols in *loaded-libs* is a waste of memory.
  2. One interesting finding is that at the very heart of Clojure is a bit of mutable state. Keeping track of loaded libraries uses Clojure's concurrency utilities and persistent data structures, but namespaces rely on a Java concurrent collection.
  3. All new namespaces are initialised with a default set of import mappings, mostly classes from java.lang. This has two main implications:
    1. The only thing special about classes from java.lang is that their mappings are hard-coded. If a new java.lang class were added to Java, it would not be imported by default until RT.java was updated with a new mapping.
    2. Imports are mappings of symbols to class objects, and there is not a separate set of mappings for Clojure vars.
  4. Since Clojure keeps track of what's loaded in two different places, it's possible to mess up the environment in strange ways. In particular, remove-ns does not clear a symbol from *loaded-libs*, meaning that it would be possible to get Clojure into a state where it thinks a namespace is loaded when actually it is not.
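The last point can be illustrated with a toy model in Java. This is only a sketch of the bookkeeping problem, not of Clojure's implementation: the TwoRegistries class and its method names are hypothetical, and it assumes that a require-like operation consults only the set of loaded libraries before deciding whether to load.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model: two independent registries that can drift apart.
final class TwoRegistries {
  final Set<String> loadedLibs = new HashSet<>();          // plays the role of *loaded-libs*
  final Map<String, Object> namespaces = new HashMap<>();  // plays the role of the namespace map
  int loadCount = 0;

  // Require-like operation: skip loading if the lib is recorded as loaded.
  void require(String lib) {
    if (loadedLibs.contains(lib)) return;
    namespaces.put(lib, new Object());
    loadedLibs.add(lib);
    loadCount++;
  }

  // Remove-like operation: only the namespace map is updated.
  void removeNs(String lib) {
    namespaces.remove(lib);
  }
}
```

Calling require, then removeNs, then require again on the same name leaves the name in loadedLibs but not in namespaces, which is the inconsistent state described above.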

Footnotes

  1. Note that the names of the anonymous function classes can be different each time you compile.
  2. Unsurprisingly, symbols are immutable and annotating one with metadata generates a new symbol.
  3. In particular, it is a ConcurrentHashMap, a concurrency-friendly implementation of a map from Java. It is not a persistent data structure, but it does concurrent reads and limited concurrent writes.

Saturday, 18 January 2014

Dart vs. ClojureScript: first impressions

Update: I have spent some more time with both Dart and ClojureScript since I wrote this, and I have written more about this in Dart vs. ClojureScript: two weeks later.

Dart is a relatively new programming language created by Google for client-side web development. I've started looking into it a bit as I am planning a local Dart Flight School event as one of the organisers for GDG Houston. In particular, I spent about an hour going through the Darrrt code lab. Having done this, I was curious to see how the same application coded in ClojureScript would compare.

First impressions

Let me start by saying I haven't done much JavaScript programming since about 1996, and I am generally more comfortable with the server side of web programming than the client side. This makes a big difference in how I approach Dart and ClojureScript, as I am relatively unfamiliar with the underlying platform. Someone with a lot of experience with JavaScript and client-side web programming may approach these languages differently.

Dart

My first impression of Dart is that it is very Java-like. Its features include:

  • static typing
  • object-oriented
  • Java-like syntax
  • generics
  • exceptions

One of the major ways in which Dart is different from Java is that it has lexical closures and first-class functions, which are welcome additions. Additionally, I think Dart has pretty good documentation and a fairly decent standard library.

ClojureScript

This isn't my first time using ClojureScript; nonetheless, it's still different enough in tooling and language details that it takes me a little bit more time to get up and running compared to a traditional Clojure project. However, the biggest problem that I ran into isn't the language, but the library.

With Dart, things like making asynchronous HTTP requests and manipulating the DOM are baked right into the standard library. With ClojureScript, it's not quite as straightforward. Do I use JavaScript primitives, the Google Closure library, or look for a ClojureScript library that wraps Closure or raw JavaScript? In the end, I used ClojureScript libraries such as Domina and storage-atom.

Some comparisons

Even within this short code lab, there are a few places where the differences between the two languages were remarkable.

Serialisation

The code lab includes both storing bits of information in HTML5 local storage and loading data from an external file. In the case of Dart, I found it to be somewhat painful as it required transforming things into or from JSON.

For example, a Dart PirateName class requires the following JSON read/write code:

final String TREASURE_KEY = 'pirateName';

class PirateName {
  PirateName.fromJSON(String jsonString) {
    Map storedName = JSON.decode(jsonString);
    _firstName = storedName['f'];
    _appellation = storedName['a'];
  }

  String get jsonString => '{ "f": "$_firstName", "a": "$_appellation" }';
}

PirateName getBadgeNameFromStorage() {
  String storedName = window.localStorage[TREASURE_KEY];
  if (storedName != null) {
    return new PirateName.fromJSON(storedName);
  } else {
    return null;
  }
}

In comparison, with ClojureScript we can use the reader and a map instead of an object. Combining this with the storage-atom library makes reading from and writing to local storage trivial.

(def storage (local-storage (atom {}) :pirate-storage))

; write to local storage
(swap! storage assoc :pirate-name name)

; read from local storage
(:pirate-name @storage)

Likewise, in Dart, reading a set of names and appellations from an external file requires decoding JSON:

class PirateNames {
  static _parsePirateNamesFromJSON(String jsonString) {
    Map pirateNames = JSON.decode(jsonString);
    names = pirateNames['names'];
    appellations = pirateNames['appellations'];
  }
}

In Clojure, we can just use read-string.

Built-in functions

While Dart's library has a fairly decent set of functions for doing things like interacting with the DOM, it doesn't have some of the functionality that ClojureScript has built in.

For example, compare (rand-nth names) to:

final Random indexGen = new Random();

name = names[indexGen.nextInt(names.length)];

Asynchronous code

With ClojureScript, you can use core.async as described by David Nolen to write asynchronous code that reads logically. Dart has no comparable functionality (though it does have functionality to support asynchronous code in general). While using core.async for this code lab is probably unnecessary, I can definitely see how it could make a more complex application easier to understand and maintain.

Final thoughts

I can definitely see how Dart is a big improvement over JavaScript. Nonetheless, I think that ClojureScript is fundamentally a more powerful language. Its shortcomings in comparison to Dart seem to generally lie in terms of libraries, which can be easily written by the community. As such, while I have found Dart interesting, and will continue to learn more about it, at this time I would lean more towards using ClojureScript in a project.

Wednesday, 18 December 2013

Looking for Clojure Glass developers

I have recently become a Google Glass explorer. Naturally, as a Clojure developer, I have started looking into developing Glassware using Clojure.

Developing for Glass

There are two different APIs for writing Glassware:

  1. The Mirror API, which allows you to write applications for Glass via a RESTful API. As a result, this means you can easily get started developing for Glass using your existing development skillset.

  2. The Glass Development Kit (GDK) allows you to write applications that run on Glass itself. Glass runs a version of Android, and the GDK is a library that works with the existing Android development kit and tools. As a result, writing Glassware with Clojure and GDK is likely to run into the same caveats that come with Clojure/Android development.

I have started working on an idiomatic Clojure library that wraps the Mirror API, and I am working on using it in a Compojure-based web app that replicates the features of the standard Mirror Demo app published by Google. I haven't gotten far enough to publish anything quite yet.

Looking for more Clojure Glass Developers

I have been given a chance to hand out a few invitations for more Glass explorers, and I would love to hand them out to Clojure developers. If you'd like to get a chance to win an invitation, please leave a comment here letting me know what kinds of cool things you'd like to do with Glass. Be sure to include some sort of way for me to contact you. If you prefer to not post publicly, just send an e-mail to <clojure-glass-invite at deepbluelambda.org>.

The deadline for entries is 2013-12-21 06:00 UTC.

The fine print

Unfortunately, Google imposes a few requirements on invitees. You must:

  1. Be a US resident,
  2. Be at least 18 years old,
  3. Provide a US shipping address or pick up your Glass at one of Google's locations in New York, San Francisco or Los Angeles, and
  4. Be willing to spend $1500 for the device.

Wednesday, 20 November 2013

Programming Clojure with Vim (2013 edition)

It's been over three years since I wrote Programming Clojure with Vim, and a lot has changed since then, including:

  • Vim now bundles some static Clojure support (Yay!).
  • VimClojure development has ceased, now replaced by two new projects:
    1. vim-clojure-static, a fork of the static portion of VimClojure.
    2. vim-fireplace, a new dynamic back end for Clojure.

How has my environment changed as a result? Quite a bit, in fact:

  • I have replaced VimClojure with vim-clojure-static and vim-fireplace.
  • I love my rainbow parentheses, so I am using rainbow_parentheses.vim.
  • I have finally started using paredit from slimv.

So, how is this working out for me?

The good stuff

Look, ma, no setup!

I think the best part of this new configuration is that it uses nREPL instead of VimClojure's nailgun-based protocol. This means that I'm just a lein repl away from having my dynamic Clojure support in Vim. I no longer need to fuss with lein-tarsier or create custom development scripts.

This is a big deal.

Gone over to paredit

So, I've started using paredit. It's long been a feature I've had some interest in, as all the Emacs users I know really love it. However, I could never get used to Emacs, and when I last reviewed Slimv's paredit support, I had complained about it being buggy.

Fast forward a few years, and it's gotten a lot better. I'm still a paredit newbie, so I'm not yet slurping and barfing s-expressions like it's second nature. Nonetheless, I can finally see why Emacs users love it so much.

paredit and LustyJuggler

There is one important caveat I have found for paredit: It seems to conflict with using LustyJuggler with the home row. I presume this is a result of paredit's normal mode key mappings, but I'm not sure that this is something that can be easily fixed from the paredit side without making drastic changes to the key mappings. If I have some time, I'll look into writing a patch for LustyJuggler. For the moment, a less than ideal workaround is using the number keys when juggling buffers.

What am I missing?

So, what is missing with these new plugins that I had before with VimClojure? Really, not much. The key mappings and functionality of vim-fireplace are certainly different from those of VimClojure, but most of what I need is there.

vim-redl, most likely

The one thing I think is noticeably not in vim-fireplace is VimClojure's ability to do completion and highlighting dynamically based on what's loaded in the REPL. vim-redl seems to have that functionality, but I haven't tried it.

My configuration

I now use Vundle to manage my Vim plugins. I had been using Pathogen for a long time, but as I have updated how I manage my home directory across machines, Vundle became a better fit. Here's an excerpt of my vimrc with just the Clojure-specific parts:

" Fireplace (Clojure support)
Bundle 'tpope/vim-fireplace'

" Rainbow parentheses
Bundle 'kien/rainbow_parentheses.vim'
"  Parentheses colours using Solarized
let g:rbpt_colorpairs = [
  \ [ '13', '#6c71c4'],
  \ [ '5',  '#d33682'],
  \ [ '1',  '#dc322f'],
  \ [ '9',  '#cb4b16'],
  \ [ '3',  '#b58900'],
  \ [ '2',  '#859900'],
  \ [ '6',  '#2aa198'],
  \ [ '4',  '#268bd2'],
  \ ]

" Enable rainbow parentheses for all buffers
augroup rainbow_parentheses
  au!
  au VimEnter * RainbowParenthesesActivate
  au BufEnter * RainbowParenthesesLoadRound
  au BufEnter * RainbowParenthesesLoadSquare
  au BufEnter * RainbowParenthesesLoadBraces
augroup END

" SlimV
Bundle 'kovisoft/slimv'

" vim-clojure-static
Bundle 'guns/vim-clojure-static'

As you can see, installation is straightforward. The trickiest bits are for rainbow parentheses, where I do two different things:

  1. I set up the parentheses colours to use the various highlight colours from Solarized.
  2. I enable rainbow parentheses for all buffers and for brackets ([]) and braces ({}) as well. You may want to limit this to Clojure buffers, but I haven't found it too intrusive in other files. To do this, just replace the * in the au commands with something like *.clj.

Tuesday, 19 November 2013

How Clojure works: a simple namespace

Have you ever wondered how Clojure works at runtime? Perhaps you've wondered how Clojure hot swaps code at the REPL or why Clojure seems to start up so slowly. Well, here is a chance to get to know a little more about how Clojure works under the covers. We'll start by examining how a namespace bootstraps itself.

A minimal example

Let's start with a minimal example and compile the following namespace:

(ns greeter.hello)

This should generate hardly any code, right? It does nearly nothing, after all. Well, if we look at the compiler output, we find three different classes have been generated:

  1. greeter.hello__init
  2. greeter.hello$loading__4910__auto__
  3. greeter.hello$fn__17

That seems like a lot. What's going on here? Let's take a closer look.

greeter.hello__init

greeter.hello__init is the class that bootstraps the greeter.hello namespace. Every time that you try to load a namespace, Clojure will look for an AOT-compiled class that matches the namespace with an __init suffix. By loading this class, the namespace itself is loaded. So, what does such a class look like? Using the javap decompiler, we come up with the following:

package greeter;
import clojure.lang.*;

public class hello__init {
  public static {};

  public static final Var const__0;
  public static final AFn const__1;
  public static final AFn const__2;

  public static void load();
  public static void __init0();
}

What are these constants? What are all of these methods? What's that weird nameless method?

The nameless method is a static initialiser, which contains code that will run when the class itself is loaded. Clojure relies on a static initialiser (line 5) to do the actual work of bootstrapping the namespace, so let's disassemble the class and take a closer look at what is going on. This will also help us find out what these constants and static methods do.
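If static initialisers are unfamiliar, the following small Java sketch shows the behaviour relied on here: the block runs once, the first time the JVM initialises the class, no matter how often the class is used afterwards. The class names are made up for this illustration and have nothing to do with Clojure's generated classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Shared counter kept in a separate class so that reading it
// does not itself trigger initialisation of Bootstrapped.
final class InitCounter {
  static final AtomicInteger runs = new AtomicInteger();
}

final class Bootstrapped {
  static {
    // Runs once, when the JVM initialises this class.
    InitCounter.runs.incrementAndGet();
  }

  // Calling any static method forces class initialisation first.
  static void touch() {}
}

final class StaticInitDemo {
  public static void main(String[] args) {
    System.out.println(InitCounter.runs.get()); // prints 0: Bootstrapped not yet initialised
    Bootstrapped.touch();
    Bootstrapped.touch();
    System.out.println(InitCounter.runs.get()); // prints 1: the initialiser ran exactly once
  }
}
```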

The static initialiser

The javap tool will output the disassembled code for the initialiser, and it looks something like this:

public static {};
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=1, locals=0, args_size=0
         0: invokestatic  #75                 // Method __init0:()V
         3: ldc           #77                 // String greeter.hello__init
         5: invokestatic  #83                 // Method java/lang/Class.forName:(Ljava/lang/String;)Ljava/lang/Class;
         8: invokevirtual #87                 // Method java/lang/Class.getClassLoader:()Ljava/lang/ClassLoader;
        11: invokestatic  #93                 // Method clojure/lang/Compiler.pushNSandLoader:(Ljava/lang/ClassLoader;)V
        14: invokestatic  #95                 // Method load:()V
        17: invokestatic  #98                 // Method clojure/lang/Var.popThreadBindings:()V
        20: goto          27
        23: invokestatic  #98                 // Method clojure/lang/Var.popThreadBindings:()V
        26: athrow
        27: return
      Exception table:
         from    to  target type
            14    17    23   any

This is pretty cryptic, and from now on I'll skip the bytecode and just present Java code that reflects what's going on, such as:

static {
  __init0();
  ClassLoader loader = Class.forName("greeter.hello__init").getClassLoader();
  clojure.lang.Compiler.pushNSandLoader(loader);
  try {
    load();
  } finally {
    clojure.lang.Var.popThreadBindings();
  }
}

We'll see what the __init0 and load() methods do below. The Compiler.pushNSandLoader() and Var.popThreadBindings() calls create a binding context, roughly equivalent to the following Clojure code:

(binding [clojure.core/*ns*         nil
          clojure.core/*fn-loader*  loader
          clojure.core/*read-eval*  true]
  (greeter.hello__init/load))

At this point, this is not too interesting. We just see that:

  • *ns* has been nulled out,
  • The namespace's class loader is available via *fn-loader*, and
  • *read-eval* has been set to true.
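The push / try / finally / pop shape is worth seeing in isolation. Below is a heavily simplified Java model of per-thread dynamic bindings. It is a sketch under my own assumptions, not how clojure.lang.Var is implemented: BindingStack and its methods are hypothetical names, and vars are reduced to plain strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified model of thread-local binding frames.
final class BindingStack {
  private static final ThreadLocal<Deque<Map<String, Object>>> frames =
      ThreadLocal.withInitial(ArrayDeque::new);

  static void push(Map<String, Object> bindings) {
    frames.get().push(bindings);
  }

  static void pop() {
    frames.get().pop();
  }

  // Innermost binding wins; fall back to the root value if nothing is bound.
  static Object get(String name, Object root) {
    for (Map<String, Object> frame : frames.get()) {
      if (frame.containsKey(name)) return frame.get(name);
    }
    return root;
  }

  // Same shape as the static initialiser: push, run the body, always pop.
  static Object withBindings(Map<String, Object> bindings, Supplier<Object> body) {
    push(bindings);
    try {
      return body.get();
    } finally {
      pop();
    }
  }
}
```

The finally block is the important part: the frame is popped even if the body throws, so a failed load cannot leave stale bindings on the thread.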

Let's see what __init0 does next.

__init0()

You probably noticed that greeter.hello__init has a few class constants. These constants are identified by the Clojure compiler as values that are used within the class that only need to be looked up once. In a very complex namespace, such as clojure.core, there can be thousands of these constants¹. Once it has collected all of these constants, the compiler generates one __init method per 100 constants.
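The chunking rule is just a ceiling division. As a quick sanity check, here is a tiny Java helper (the class and method names are mine, and the chunk size of 100 is taken from the description above rather than from the compiler source):

```java
final class InitMethodCount {
  // Assumed chunk size, per the "one __init method per 100 constants" rule.
  static final int CONSTANTS_PER_INIT = 100;

  static int initMethods(int constantCount) {
    // Ceiling division: a partially filled last chunk still needs its own method.
    return (constantCount + CONSTANTS_PER_INIT - 1) / CONSTANTS_PER_INIT;
  }
}
```

With three constants this gives a single method, which matches the lone __init0 in our namespace, and 2342 constants give 24 methods, which agrees with the figures for clojure.core in the first footnote.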

For our namespace, we only have three constants and the one __init0 method, and the code looks something like this:

static void __init0() {
  const__0 = (Var)RT.var("clojure.core", "in-ns");
  const__1 = (AFn)Symbol.intern(null, "greeter.hello");
  const__2 = (AFn)Symbol.intern(null, "clojure.core");
}

So, we need the var clojure.core/in-ns and two symbols². Why these in particular? Well, keep in mind that ns is actually a macro, and expand it once:

(do
  (clojure.core/in-ns (quote greeter.hello))
  (clojure.core/with-loading-context (clojure.core/refer (quote clojure.core)))
  (if (.equals (quote greeter.hello) (quote clojure.core))
    nil
    (do
      (clojure.core/dosync (clojure.core/commute (clojure.core/deref (var clojure.core/*loaded-libs*))
                                                 clojure.core/conj
                                                 (quote greeter.hello)))
      nil)))

That helps clarify things somewhat, but why are only in-ns, greeter.hello, and clojure.core saved as constants and not the rest? Well, taking a look at the full macro-expansion tells us why:

(do
  (clojure.core/in-ns (quote greeter.hello))
  ((fn* loading__4910__auto__
        ([]
         (. clojure.lang.Var (clojure.core/pushThreadBindings {clojure.lang.Compiler/LOADER (. (. loading__4910__auto__ getClass) getClassLoader)}))
         (try
           (clojure.core/refer (quote clojure.core))
           (finally
             (. clojure.lang.Var (clojure.core/popThreadBindings)))))))
  (if (. (quote greeter.hello) equals (quote clojure.core))
    nil
    (do
      (. clojure.lang.LockingTransaction (clojure.core/runInTransaction (fn*
                                                                          ([]
                                                                           (clojure.core/commute (clojure.core/deref (var clojure.core/*loaded-libs*))
                                                                                                 clojure.core/conj
                                                                                                 (quote greeter.hello))))))
      nil)))

Wow. That's ugly; but we're not here to look at beautiful Clojure code—we're here to see how beautiful Clojure code really works. We see that with-loading-context and dosync do some of their work using anonymous functions. This explains the origin of the mysterious greeter.hello$fn__17 and greeter.hello$loading__4910__auto__ classes and why greeter.hello__init has relatively few constants.

Well, that's it for __init0, and there's just one method left to look at in greeter.hello__init, load().

load()

Having seen the macro-expansion of (ns greeter.hello), we can get a pretty good idea of what to expect in load(). Let's take a look:

public static void load() {
  // (in-ns 'greeter.hello)
  IFn inNs = (IFn)const__0.getRawRoot();
  inNs.invoke(const__1);

  // (with-loading-context (refer 'clojure.core))
  IFn loading4910auto = (IFn)new greeter.hello$loading__4910__auto__();
  loading4910auto.invoke();

  // (if (.equals 'greeter.hello 'clojure.core)
  //   nil
  //   (do
  //     (LockingTransaction/runInTransaction (fn* …))
  //     nil))
  Symbol greeterHello = (Symbol)const__1;
  if (greeterHello.equals(const__2)) {
    return; // the nil result is discarded: load() is void
  } else {
    Callable callable = (Callable)new greeter.hello$fn__17();
    LockingTransaction.runInTransaction(callable);
    return; // the nil result is discarded: load() is void
  }
}

Unsurprisingly, the first thing that happens is that (in-ns 'greeter.hello) is run. This will create the greeter.hello namespace and set *ns* to point to it. The next couple of lines perform (refer 'clojure.core), but do so in an anonymous function. Finally, the last part of the function checks to see if greeter.hello is the same as clojure.core. If not, it instantiates the second anonymous function, which will ensure that Clojure knows that the namespace has been loaded.

And that's all the greeter.hello__init class does, so now let's take a closer look at those anonymous functions.

greeter.hello$loading__4910__auto__

Remember that greeter.hello$loading__4910__auto__ is the expansion of (with-loading-context (refer 'clojure.core)), which is essentially:

  (fn* loading__4910__auto__
   ([]
    (clojure.lang.Var/pushThreadBindings {clojure.lang.Compiler/LOADER (.getClassLoader (.getClass loading__4910__auto__))})
    (try
     (refer 'clojure.core)
     (finally
      (clojure.lang.Var/popThreadBindings)))))

So, what does the class signature look like? Let's see:

package greeter;
import clojure.lang.*;

public final class hello$loading__4910__auto__ extends AFunction {
  public static final Var const__0;
  public static final AFn const__1;

  public static {};
  public hello$loading__4910__auto__();

  public Object invoke();
}

Right off the bat, we see some similarities to our namespace class, namely the class constants and the static initialiser. Let's check out the decompiled results.

The static initialiser

As expected, the static initialiser does bear a great resemblance to the __init0 method we saw above:

static {
  const__0 = (Var)RT.var("clojure.core", "refer");
  const__1 = (AFn)Symbol.intern(null, "clojure.core");
}

We can surmise that this is a common technique by the Clojure compiler. The big difference between function classes and namespace classes appears to be that function classes instantiate their constants right in the static initialiser instead of delegating to a helper method.

All of this takes place once, when the class is loaded. However, each time the function is used, a new object will have to be constructed.

Constructor

It turns out that the constructor is fairly trivial:

public hello$loading__4910__auto__() {
  super();
}

This most likely would not have been the case had the function been a closure. In such a case, the closed over values would have been arguments to the constructor and there would have been corresponding fields.
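To show what I mean, here is a hand-written Java sketch of the two cases. It is not compiler output: SketchFn, Identity, and AddN are invented names, and the real generated classes extend AFunction rather than implementing a small interface like this one.

```java
// Hypothetical minimal function interface standing in for Clojure's function classes.
interface SketchFn {
  Object invoke(Object arg);
}

// Not a closure: nothing is captured, so there are no fields
// and the constructor is trivial.
final class Identity implements SketchFn {
  Identity() {}

  public Object invoke(Object arg) {
    return arg;
  }
}

// A closure along the lines of (fn [x] (+ x n)): the closed-over n
// becomes a constructor argument stored in a final field.
final class AddN implements SketchFn {
  private final long n;

  AddN(long n) {
    this.n = n;
  }

  public Object invoke(Object arg) {
    return ((Number) arg).longValue() + n;
  }
}
```

Each evaluation of the enclosing code would then construct a fresh AddN with the current value of n, while the class itself is still loaded only once.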

Finally, let's get to the juicy part, invoke().

invoke()

public Object invoke() {
  // (Var/pushThreadBindings {Compiler/LOADER (.getClassLoader (.getClass loading__4910__auto__))})
  Object[] bindings = new Object[2];
  bindings[0] = Compiler.LOADER;
  bindings[1] = this.getClass().getClassLoader();
  Var.pushThreadBindings((Associative)RT.mapUniqueKeys(bindings));

  try {
    // (refer 'clojure.core)
    IFn refer = (IFn)const__0.getRawRoot();
    return refer.invoke(const__1);
  } finally {
    Var.popThreadBindings();
  }
}

The content is more or less what you'd expect. The only really interesting part is how the map for the thread bindings is created. It calls the variadic RT.mapUniqueKeys utility method which will, depending on the number of arguments given, return a singleton empty map, a PersistentArrayMap instance, or a PersistentHashMap instance.

And that's that for greeter.hello$loading__4910__auto__. We can finally look at the last anonymous function.

greeter.hello$fn__17

This anonymous function holds some code that needs to run in a transaction, as seen in the snippet below:

(dosync (commute (deref #'clojure.core/*loaded-libs*)
                 conj
                 'greeter.hello))

This is modifying the global *loaded-libs* var to let Clojure know that the greeter.hello namespace has been loaded. The expansion of this code looks like:

(LockingTransaction/runInTransaction (fn* ([] (commute (deref #'clojure.core/*loaded-libs*)
                                                       conj
                                                       'greeter.hello))))

This anonymous function holds the commute invocation, so let's see what this function's class looks like:

package greeter;
import clojure.lang.*;

public final class hello$fn__17 extends AFunction {
  public static final Var const__0;
  public static final Var const__1;
  public static final Var const__2;
  public static final Var const__3;
  public static final AFn const__4;

  public static {};

  public hello$fn__17();

  public Object invoke();
}

There is nothing surprising here. We just have a few more constants than in our previous examples.

The static initialiser and constructor

By now, the contents of the static initialiser should be predictable. Can you guess what each field will be?

static {
  const__0 = (Var)RT.var("clojure.core", "commute");
  const__1 = (Var)RT.var("clojure.core", "deref");
  const__2 = (Var)RT.var("clojure.core", "*loaded-libs*");
  const__3 = (Var)RT.var("clojure.core", "conj");
  const__4 = (AFn)Symbol.intern(null, "greeter.hello");
}

Likewise, the constructor is trivial:

public hello$fn__17() {
  super();
}

Let's see if invoke() is any more interesting.

invoke()

The heart of any Clojure function class is its invoke() method. What does this one look like?

public Object invoke() {
  IFn commute = (IFn)const__0.getRawRoot();
  IFn deref = (IFn)const__1.getRawRoot();
  // (deref #'clojure.core/*loaded-libs*)
  Object loadedLibs = deref.invoke(const__2);
  Object conj = const__3.getRawRoot();
  // (commute loadedLibs conj 'greeter.hello)
  return commute.invoke(loadedLibs, conj, const__4);
}

Again, nothing new. Each var's value is retrieved and the deref and commute functions are invoked.

Parting thoughts

I went through this exercise in an attempt to get a better understanding of Clojure's runtime. As we can see, just loading a namespace involves:

  • Loading three classes, two of which may never need to be reused
  • Instantiating at least a half dozen objects
  • Getting references to vars and symbols, at least once for each class that uses it
  • Getting the value of the var each time it is used

While this seems like a lot of overhead, the fact of the matter is that this enables Clojure's dynamic runtime environment. Without it, there would be no REPL or REPL-driven development.
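The connection between the per-call var lookup and the REPL can be sketched in a few lines of Java. This is a toy model under my own assumptions: SketchVar is a made-up stand-in that borrows the getRawRoot name seen in the decompiled code, and bindRoot here simply overwrites the stored value.

```java
import java.util.function.Supplier;

// Hypothetical minimal stand-in for a var: a mutable cell holding a root value.
final class SketchVar<T> {
  private volatile T root;

  SketchVar(T root) {
    this.root = root;
  }

  T getRawRoot() {
    return root;
  }

  void bindRoot(T value) {
    root = value;
  }
}

final class Caller {
  // The var reference is resolved once, like a const__N field.
  static final SketchVar<Supplier<String>> GREET =
      new SketchVar<Supplier<String>>(() -> "hello");

  static String call() {
    // The var's value is fetched on every call, so a redefinition
    // is picked up without recompiling or reloading Caller.
    return GREET.getRawRoot().get();
  }
}
```

Rebinding GREET changes what call() returns the next time it runs, which is the same indirection that lets a redefinition at the REPL take effect in code that has already been compiled.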

However, sometimes you don't want all of that. A dynamic runtime is great for development, but some production environments have constraints that make this flexibility a bad trade-off. It's been a long-standing desire of mine to have a leaner Clojure runtime, and I believe it's important to understand how the current runtime works.

In a future entry, I plan to examine the impact of adding a var to the namespace.

Footnotes

  1. A quick glance at clojure.core reveals 2342 constants and 24 __init functions.
  2. It's somewhat curious that symbols are stored as abstract functions.

Monday, 12 December 2011

Clojure REPL Tip: Loading Scripts

Ever wanted to write a script once and then run it using the Clojure REPL for Android? I was recently asked about this via e-mail, and I thought it would be good to make a note of it here.

First, you will need to write your script and get it onto your phone in a world-readable location. Usually, the SD card, which is generally mounted at /sdcard, should work well enough. Therefore, if you transfer your script coolscript.clj to the SD card, it should show up at /sdcard/coolscript.clj.

Now, all you need is the load-file command. To run the script above, you would just enter:

(load-file "/sdcard/coolscript.clj")

Limitation

Note there is one important limitation to this approach. It does not change the path on which scripts are looked up. If your script requires searching for other files that define namespaces, it generally will not work. Adding directories to the search path is a proposed feature that would be nice to add.

Behind the scenes

I have not abandoned the REPL. Since the Conj, I have been working on the REPL to help get some more feedback to the Clojure/dev team to improve Clojure on Android. At the Conj, I announced Neko, the Clojure/Android Toolkit. I have since updated it to be compatible with the latest versions of both Clojure and the Android SDK. I also now have a version of Clojure for Android that is compatible with the latest 1.4 developments.

Please stay tuned for more Clojure/Android news and a Clojure/conj wrap-up.

Tuesday, 05 April 2011

Decaffeinated Robot: source, slides, and audio

This past weekend the second Texas Linux Fest was held in Austin, Texas. I had a great time attending the various sessions and the expo space. I am also glad to have had the opportunity to speak about alternative approaches to Android development. Mobile development is certainly a hot topic these days, and I have been told that the room overflowed into the hallway for my talk.

As promised, here are the slides, source code, and audio from my presentation:

I would like to thank the TXLF organisers for a great event. I am looking forward to TXLF 2012.

Sunday, 27 February 2011

Clojure for Android source published

Over the next few weeks, I will be publishing the source code for the Clojure REPL for Android in a few different instalments:

  1. Clojure for Android, a modified version of Clojure adapted to run on the Dalvik virtual machine;
  2. Clojure Android Toolkit, a library of utilities for Clojure developers working on Android; and
  3. Clojure REPL, the source code of the application itself.

I have now published the modifications to my source code in a repository available on GitHub. My work is based on the 1.2.x branch of the Clojure source code and is available in the android-1.2.x branch.

This post will document my goals for Clojure on Android, give an overview of the changes I have made, describe the current implementation of dynamic compilation, and present areas for future work.

Goals

The three primary goals of the Clojure for Android release are as follows:

  1. Create a version of Clojure that works for both the Java and Dalvik virtual machines in the hope that the changes can eventually be included in Clojure itself,
  2. Create a development version of Clojure that supports dynamic compilation to enable more rapid development of applications, and
  3. Create a lean Clojure runtime that will deliver acceptable performance on Android devices.

Overview of changes

There really are not many changes in this initial release of Clojure for Android. They fall into three categories: the addition of a Dalvik-specific dynamic class loader, some minor runtime changes, and an update to the build configuration to support Android.

New DynamicClassLoader hierarchy

The most significant change is to DynamicClassLoader. In the original implementation, this class manages a constant pool, maintains a cache of class definitions, provides (deprecated) class path alteration capability, and is in charge of turning compiled class byte codes into classes available within the virtual machine.

In my implementation, it retains all of those abilities save for the last one. It is now an abstract class that delegates class realisation to its subclasses, of which there are two:

  1. JvmDynamicClassLoader relies on Java's standard ClassLoader.defineClass method.
  2. DalvikDynamicClassLoader uses a tortuous method, described later.
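
As a rough illustration of this split, the hierarchy could look something like the sketch below. It is not the code from the repository: the Sketch prefix on the class names and the method name realiseClass are assumptions made for this example, and the Dalvik branch is left as a placeholder because it needs the Android runtime.

```java
// Hypothetical sketch of the loader split; the names are illustrative and
// are not taken from the Clojure for Android source.
abstract class SketchDynamicClassLoader extends ClassLoader {
    SketchDynamicClassLoader(ClassLoader parent) {
        super(parent);
    }

    // The one responsibility pushed down to subclasses: turn compiled
    // byte code into a class that the running virtual machine can use.
    abstract Class<?> realiseClass(String name, byte[] byteCode);
}

// JVM flavour: defers to Java's standard ClassLoader.defineClass.
class SketchJvmDynamicClassLoader extends SketchDynamicClassLoader {
    SketchJvmDynamicClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    Class<?> realiseClass(String name, byte[] byteCode) {
        // Throws ClassFormatError if byteCode is not a valid class file.
        return defineClass(name, byteCode, 0, byteCode.length);
    }
}

// Dalvik flavour: a placeholder only, since the real work depends on
// Android classes that do not exist on a desktop JVM.
class SketchDalvikDynamicClassLoader extends SketchDynamicClassLoader {
    SketchDalvikDynamicClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    Class<?> realiseClass(String name, byte[] byteCode) {
        throw new UnsupportedOperationException("requires the Android runtime");
    }
}
```

Callers would hold a reference to the abstract type only, so the rest of the runtime does not need to know which virtual machine it is running on.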

Runtime changes

There are a few relatively minor runtime changes:

  • A new var, clojure.core/*vm-type*, is set to either :dalvik-vm or :java-vm at runtime.
  • The correct DynamicClassLoader implementation is chosen depending on *vm-type*.
  • A workaround is included for a bug, fixed in ‘FroYo’, where the context class loader is set to a system class loader instead of the application’s class loader.
  • The pre-emptive loading of clojure.set, clojure.xml, and clojure.zip is disabled on Dalvik.

That’s it.
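
To make the first two changes concrete, here is a minimal sketch of how a runtime might classify its virtual machine and pick a loader. It assumes that the Dalvik VM reports "Dalvik" in the java.vm.name system property. The enum, class, and method names are invented for this example and are not taken from the actual patch.

```java
// Hypothetical sketch: the enum and method names are illustrative only.
enum SketchVmType { DALVIK_VM, JAVA_VM }

class VmTypeSketch {
    // Classifies a virtual machine from its "java.vm.name" property value.
    // Assumption: the Dalvik VM reports the name "Dalvik"; anything else
    // (including a missing value) is treated as a regular Java VM.
    static SketchVmType fromVmName(String vmName) {
        return "Dalvik".equals(vmName) ? SketchVmType.DALVIK_VM : SketchVmType.JAVA_VM;
    }

    // Classifies the virtual machine this code is currently running on.
    static SketchVmType current() {
        return fromVmName(System.getProperty("java.vm.name"));
    }

    // Names the loader flavour that would be chosen for a given VM type.
    static String loaderNameFor(SketchVmType type) {
        switch (type) {
            case DALVIK_VM:
                return "DalvikDynamicClassLoader";
            default:
                return "JvmDynamicClassLoader";
        }
    }
}
```

On a desktop JVM, VmTypeSketch.loaderNameFor(VmTypeSketch.current()) yields the JVM loader name, which mirrors the choice made from *vm-type*.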

Build system update

The build system has received somewhat more extensive changes. There are two basic scenarios:

  1. Building Clojure without Android support: This should work just fine. Just run ant as usual.
  2. Building Clojure with Android support: You will need to create a local.properties file with pointers to the Android SDK directory and SDK version you want to use. More documentation is available in readme.txt.

When building with Android support, the build file will create a stripped-down version of the dx.jar file from the Android SDK. By default, this simply removes classes that exist purely for testing. However, if you have ProGuard, it can do a more exhaustive shrinking. This is enabled by setting the proguard.jar property.

When Android is enabled, the build will create two additional JAR files:

  1. clojure-nosrc.jar, a compiled-only version of Clojure (the opposite of clojure-slim.jar). This JAR also contains the dx tool classes that are needed at runtime.
  2. clojure-dex.jar, a version of clojure-nosrc.jar where all of the classes have been compiled into a Dalvik executable. This file is suitable for loading by one of Android’s class loading mechanisms.

Dynamic compilation

To illustrate how I implemented dynamic compilation in Clojure, I will first present the traditional path from compiled Java class to instantiated Dalvik class. Next, I will show how the modified version of Clojure takes dynamically generated classes through the same process. Finally, I will present the trade-offs involved in the current implementation and what you should keep in mind when using dynamic compilation.

Traditional work flow

The following is a brief description of the path of a compiled class file from build to execution:

  1. At build time:
    1. Java files are compiled into Java classes made up of JVM byte codes.
    2. All of the classes are prepared into a Dalvik Executable (DEX file) by the dx tool. This file is called classes.dex.
    3. The DEX file is placed into the Android package.
  2. At install time:
    1. The installer reads the DEX file from the package.
    2. The DEX file is verified to rule out illegal instructions, and some computations are performed to aid garbage collection.
    3. The verified DEX data is then optimised, creating a hardware- and platform-specific version of the code. Optimisations include replacing virtual method call resolution with indices in a vtable, inlining method calls, pruning empty methods, etc.
    4. The resulting optimised DEX file (ODEX file) is written to a special cache directory.
  3. At run time:
    1. The ODEX file is checked to make sure it is still valid. If not, then the original DEX file is again verified and optimised.
    2. The application loads its classes from the ODEX file.

Dynamic Clojure work flow

  1. The Clojure evaluator compiles a form into a class using the embedded ASM bytecode engineering library.
  2. The DalvikDynamicClassLoader processes the compiled byte code as follows:
    1. It uses the embedded dx tool to translate the JVM class into an in-memory DEX file.
    2. It writes the DEX file into a temporary JAR file in *compile-path*.
    3. It uses Android’s dalvik.system.DexFile to load the JAR file. In doing so, Android will create an ODEX file in *compile-path*.
    4. It loads the class from the DexFile object and returns it.
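
The JAR-writing step in the list above can be sketched on a desktop JVM as follows. This is a minimal illustration under stated assumptions, not the loader's actual code: the class and method names are invented, the entry name classes.dex is the conventional name Android looks for inside a JAR, and the bytes written here are a stand-in for real dx output. The translation and loading steps are only described in comments because they need the dx classes and the Android runtime.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical sketch of the "write the DEX file into a temporary JAR" step.
class DexJarSketch {
    // Wraps DEX bytes in a new temporary JAR inside compilePath.
    // On Android, the resulting file would then be handed to
    // dalvik.system.DexFile, which writes its ODEX output alongside it.
    static File writeDexJar(File compilePath, byte[] dexBytes) {
        try {
            File jar = File.createTempFile("repl-", ".jar", compilePath);
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
                out.putNextEntry(new JarEntry("classes.dex"));
                out.write(dexBytes);
                out.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the classes.dex entry back, to check the JAR round-trips.
    // InputStream.readAllBytes requires Java 9 or later.
    static byte[] readDexEntry(File jar) {
        try (JarFile jarFile = new JarFile(jar);
             InputStream in = jarFile.getInputStream(jarFile.getEntry("classes.dex"))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every evaluated form pays for this file write plus the ODEX generation that follows it, which goes some way towards explaining the cost of the approach.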

Trade-offs

The main disadvantage to this form of dynamic compilation is that it is slow. It requires using the disk, as well as performing all sorts of computations at runtime. Anyone who has used the Clojure REPL for Android can attest to its sluggishness.

Unfortunately, to the best of my knowledge, there are no other accessible APIs available for doing this better. Most of the work is done in native code, making it difficult to bundle it into Clojure. There is some hope that this may change in the future. From the Dalvik documentation:

Some languages and frameworks rely on the ability to generate bytecode and execute it. The rather heavy dexopt verification and optimization model doesn't work well with that.

We intend to support this in a future release, but the exact method is to be determined. We may allow individual classes to be added or whole DEX files; may allow Java bytecode or Dalvik bytecode in instructions; may perform the usual set of optimizations, or use a separate interpreter that performs on-first-use optimizations directly on the bytecode

Until such an API is released, it is necessary to either take the slow but simple route, or to create a Clojure compiler for the Dalvik VM from scratch.

I think that dynamic compilation is of interest primarily to developers. Most applications will have no need for dynamic compilation as they can be AOT-compiled. As such, the slowness may well be acceptable. After all, waiting seconds for a function to recompile in your running application is much more tolerable than needing to go through a full compile-deploy cycle that may be measured in minutes.

Caveats

There are two things to be aware of when using dynamic compilation:

  1. You will need to be sure to point *compile-path* to some place where your application has write access.
  2. Some forms may blow the stack during compilation, such as (for [x (range 5) y (range 5)] [x y]). This is a limitation of the runtime.
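
Both caveats can be guarded against from the Java side, sketched below. The first helper checks that a candidate directory for *compile-path* is writable. The second runs a task on a thread that requests a larger stack, which is a general JVM technique and an assumption on my part about how one might mitigate the stack problem, not something the current release does. All names are hypothetical, and the requested stack size is only a hint that a virtual machine is free to ignore.

```java
import java.io.File;
import java.util.concurrent.Callable;

// Hypothetical helpers for the two caveats; names are illustrative only.
class CaveatSketch {
    // True if dir exists, is a directory, and is writable by this process.
    static boolean isUsableCompilePath(File dir) {
        return dir.isDirectory() && dir.canWrite();
    }

    // Runs task on a new thread that requests stackBytes of stack space
    // and waits for its result. The stack size is a hint to the VM.
    static <T> T callWithStack(Callable<T> task, long stackBytes) {
        Object[] result = new Object[1];
        Throwable[] failure = new Throwable[1];
        Thread worker = new Thread(null, () -> {
            try {
                result[0] = task.call();
            } catch (Throwable t) {
                failure[0] = t;
            }
        }, "sketch-compile", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (failure[0] != null) {
            throw new IllegalStateException(failure[0]);
        }
        @SuppressWarnings("unchecked")
        T value = (T) result[0];
        return value;
    }

    // Deliberately deep recursion, standing in for a stack-hungry compile.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }
}
```

For example, CaveatSketch.callWithStack(() -> CaveatSketch.depth(50_000), 64L * 1024 * 1024) asks for a 64 MB stack before recursing.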

Future work

There is still much to be done. As the source is now released, I look forward to seeing what sorts of feedback and improvements will come from others. Given the three goals I stated above, I think much of the work is as follows:

Integration with upstream

Get feedback on how to best integrate these changes into the language from other Clojure developers in general and, I hope, the Clojure/core team itself. Of course, it will most likely take some time before these changes make it into a Clojure release.

Clojure for Android development

While not perfect, I think the current solution largely satisfies this goal. Writing a new compilation back-end may make things better, but I am not sure that it will provide as good a return as working on the third goal.

I think that any new development should follow the master branch of Clojure going forward. The patches to the code itself should be simple enough to manage. Porting the build system changes to Maven will be more cumbersome.

Lean Clojure runtime

This is the place where the most work needs to be done. There have already been some good ideas presented on how to improve this, such as:

  • Eliminating metadata from compiled code to reduce memory footprint,
  • Finding ways to cut down on the immense amount of object churn during bootstrap, and
  • Generally finding ways to cut down on the amount of work done during the bootstrapping process.

My current idea is to find ways to modularise clojure/core.clj somewhat to be able to either completely eliminate some functionality (such as that needed for dynamic compilation) or at least delay loading it. Not every program makes use of every feature of the language. Some programs may never use one or more of: agents, futures, primitive vectors and arrays, etc. If there were some way to make some of these things load-on-demand, if only in an Android environment, that could significantly improve bootstrap times.
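
The load-on-demand idea can be illustrated with a generic lazy holder in Java. This is only a sketch of the general technique, with hypothetical names; how such a thing would actually be wired into clojure/core.clj is an open question.

```java
import java.util.function.Supplier;

// Hypothetical lazy holder: the expensive initialiser runs at most once,
// and only if something actually asks for the value.
final class LazySketch<T> implements Supplier<T> {
    private Supplier<T> initialiser;
    private T value;
    private boolean loaded;

    LazySketch(Supplier<T> initialiser) {
        this.initialiser = initialiser;
    }

    @Override
    public synchronized T get() {
        if (!loaded) {
            value = initialiser.get();
            loaded = true;
            initialiser = null; // allow the initialiser to be collected
        }
        return value;
    }

    // True once the value has been initialised.
    synchronized boolean isLoaded() {
        return loaded;
    }
}
```

A feature wrapped this way costs nothing at bootstrap; a program that never touches it never pays for it.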

I look forward to the feedback from others and welcome any help in trying to get these things working. I think that it is quite possible to make Clojure a first-class development language for the Android platform.