Saturday, 04 February 2017

Know your keywords

Namespace-qualified keywords have existed since the beginning of Clojure, but they have seen relatively little use. However, Clojure 1.9 will introduce spec, and spec uses this feature quite heavily. Now that this once-obscure feature is getting some real attention, it has led to a lot of confusion. Do you know the difference between :foo, ::foo, ::bar/foo, and :bar/foo? If not, I hope you will by the end of this post.

First, a refresher on namespace-qualified symbols

Keywords, like symbols, can be qualified with a namespace. As a Clojure developer, you probably have some experience with namespace-qualified symbols. For example, you may have used something like:

(ns example
  (:require [clojure.string :as str]))

(defn print-items
  [items]
  (println (str "The items are: " (str/join ", " items))))

On line 2 of this example, we require the namespace clojure.string and alias it to str. As a result, on line 6, Clojure will see str/join and resolve the namespace alias and interpret the symbol as clojure.string/join.

Note that Clojure treats str on its own differently from the str before the slash in str/join. The str by itself is looked up within the scope of the current namespace, where it has been referred to clojure.core/str (you can see all referred symbols by running (ns-refers *ns*)). In contrast, str/join is a namespace-qualified symbol, so Clojure looks for a join in the str namespace, which is aliased to clojure.string.
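
To make the two lookups concrete, here is a small sketch that inspects the alias table and the refer table of the example namespace directly. It uses only functions from clojure.core (ns-aliases, ns-refers, and ns-name); the vars str-alias and str-refer are names made up for this illustration.

```clojure
(ns example
  (:require [clojure.string :as str]))

;; The alias table maps the symbol str to the clojure.string namespace.
(def str-alias (get (ns-aliases 'example) 'str))

;; The refer table maps the bare symbol str to the var clojure.core/str.
(def str-refer (get (ns-refers 'example) 'str))

(println (ns-name str-alias)) ; prints clojure.string
(println str-refer)           ; prints #'clojure.core/str
```

The same symbol, str, therefore lives in two separate tables, which is why str and str/join can mean unrelated things in one namespace.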

However, you do not have to use aliases; you can write the following equivalent code:

(ns example
  (:require [clojure.string]))

(defn print-items
  [items]
  (println (str "The items are: " (clojure.string/join ", " items))))

In this snippet, we provide the full namespace and there is no need to look up an alias. However, had we just used str/join, Clojure would produce the error: No such namespace: str.

How symbols and keywords are different

In the last section, we saw how Clojure treats namespace-qualified symbols. The fact is that it treats namespace-qualified keywords like symbols, with two major differences:

  1. Keywords evaluate to themselves, whereas symbols must resolve to a value.
  2. Only keywords that begin with a double colon participate in namespace auto-resolution.

Let’s examine each of these in turn.

Keywords evaluate to themselves

This means that a Clojure keyword can have any namespace we want. The namespace does not have to exist because the keyword does not need to point to anything.

(ns example
  (:require [clojure.string :as str]))

(prn :foo)                    ;; :foo
(prn :example/foo)            ;; :example/foo
(prn :clojure.string/foo)     ;; :clojure.string/foo
(prn :clojure.is.awesome/foo) ;; :clojure.is.awesome/foo
(prn :str/foo)                ;; :str/foo

In line 4, we see that omitting the namespace gets us what we are used to: :foo evaluates to :foo. However, as we see in lines 5–7, we can add a namespace, which results in having a keyword with that namespace, regardless of whether that namespace exists or not—the keywords merely evaluate to themselves. However, note that on line 8 :str/foo does not evaluate to :clojure.string/foo. If we want that feature, we need to use a double-colon keyword.
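
The same point can be checked with the core functions namespace, name, and find-ns. This is only a small sketch; clojure.is.awesome is the made-up namespace from the example above.

```clojure
(ns example
  (:require [clojure.string :as str]))

;; A keyword only carries its namespace as a string; nothing is looked up.
(println (namespace :clojure.is.awesome/foo)) ; prints clojure.is.awesome
(println (name :clojure.is.awesome/foo))      ; prints foo

;; No namespace called clojure.is.awesome has been loaded, yet the keyword is valid.
(println (find-ns 'clojure.is.awesome))       ; prints nil

;; With a single colon, str is not treated as an alias, so these differ.
(println (= :str/foo :clojure.string/foo))    ; prints false
```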

Double-colon keywords: keywords with namespace resolution

Keywords that start with two colons instead of just one participate in namespace resolution, much like symbols:

(ns example
  (:require [clojure.string :as str]))

(prn ::str/foo)                ;; :clojure.string/foo
(prn ::clojure.string/foo)     ;; :clojure.string/foo

(prn ::awesome/foo)            ;; ERROR: Invalid token: ::awesome/foo
(prn ::clojure.is.awesome/foo) ;; ERROR: Invalid token: ::clojure.is.awesome/foo

Lines 4 and 5 above produce the same output. In the case of line 4, Clojure finds that str is an alias for clojure.string, so the result is the namespace-qualified keyword :clojure.string/foo (note there is only a single colon). On line 5, clojure.string itself is a namespace, so no additional resolution is needed and the result is :clojure.string/foo (again, one colon). Note that lines 7 and 8 both produce Invalid token errors because neither awesome nor clojure.is.awesome is the name of a loaded namespace or an alias in the current namespace.

One last thing about double-colon keywords: when you use them without specifying a namespace, they evaluate to a namespace-qualified keyword with the current namespace:

(ns example)

(prn :example/foo)   ;; :example/foo
(prn ::example/foo)  ;; :example/foo
(prn ::foo)          ;; :example/foo

As we can see in lines 3–5, within the example namespace, :example/foo, ::example/foo, and ::foo all evaluate to the same thing.
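
As a quick check, the following sketch compares the different spellings with = and takes the keywords apart with namespace. Keep in mind that the :: forms are resolved when the code is read, so what they produce depends on the namespace that is current at that point (example here).

```clojure
(ns example
  (:require [clojure.string :as str]))

;; All three spellings read as the same keyword inside the example namespace.
(println (= :example/foo ::example/foo ::foo)) ; prints true

;; An alias is expanded to the full namespace name.
(println (= ::str/join :clojure.string/join))  ; prints true
(println (namespace ::str/join))               ; prints clojure.string
(println (namespace ::foo))                    ; prints example
```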

Thursday, 18 September 2014

Introducing a new core.async web resource

Over the past few months I have been working on creating some resources to help people learn to use core.async. My goal is to make this the best resource available to help people get started with core.async and to document best practices for composing applications with core.async.

It is still a work in progress, but it currently includes the following:

  1. An introduction to channels, buffers, and basic channel operations
  2. A tutorial/workshop that introduces core.async using both Clojure and ClojureScript
  3. A reference section of the core.async API, including both the Clojure and ClojureScript sources

Please check it out at www.core-async.info.

I will continue to add more reference and tutorial material in the future. Please let me know if there is anything you think would be useful to add.

Wednesday, 20 August 2014

liberator-transit 0.3.0

I have released liberator-transit 0.3.0. The main new feature in this release is the option to set write handlers for Transit. To do so, just add them to the options using the :handlers key just as if you were calling transit/writer. For more details, check out the README.

Sunday, 10 August 2014

liberator-transit 0.2.0

It didn’t take long. I have released liberator-transit version 0.2.0, which fixes one bug and adds some configurability. Now, the response will not have a charset added to the Content-Type header. Furthermore, you can now disable verbose JSON output, make verbose JSON the default, and change the size of the initial buffer used in serialising the response. You can read the details in the README. Most importantly, this work paves the way for setting write handlers for the Transit output, currently planned for release 0.3.0.

Friday, 08 August 2014

Presenting liberator-transit

As part of my preparation for the core.async workshop I’ll be giving at Strange Loop, I have come across the need for a RESTful API for a web application I am creating. Naturally, I thought this would be a good chance to try out Liberator and Transit. I was pleased to see how easy Liberator is to use. Moreover, it was nearly trivial to add Transit serialisation support to Liberator. Nonetheless, I thought it would be good to wrap it all up in a library ready for reuse, which I have unimaginatively called liberator-transit.

How to use it

Using liberator-transit is straightforward:

  1. Add it to your project as a dependency.
  2. Require the io.clojure.liberator-transit namespace.
  3. Accept the application/transit+json and/or application/transit+msgpack media types.

For example, given the following project.clj:

(defproject liberator-transit-demo "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [compojure "1.1.8"]
                 [com.cognitect/transit-clj "0.8.247"]
                 [liberator "0.12.0"]
                 [ring/ring-core "1.3.0"]
                 [io.clojure/liberator-transit "0.1.0"]]
  :plugins [[lein-ring "0.8.11"]]
  :ring {:handler liberator-transit-demo.core/app})

And the following liberator_transit_demo/core.clj:

(ns liberator-transit-demo.core
  (:require [compojure.core :refer [ANY defroutes]]
            [io.clojure.liberator-transit]
            [liberator.core :refer [defresource]]
            [ring.middleware.params :refer [wrap-params]]))

(defresource hello [name]
  :available-media-types ["text/plain"
                          "application/edn"
                          "application/json"
                          "application/transit+json"
                          "application/transit+msgpack"]
  :handle-ok {:time (System/currentTimeMillis)
              :greeting (str "Hello, " name \!)})

(defroutes routes
  (ANY "/:name" [name] (hello name)))

(def app (wrap-params routes))

You can run the server using lein-ring:

$ lein ring server-headless
2014-08-08 23:12:23.330:INFO:oejs.Server:jetty-7.6.8.v20121106
2014-08-08 23:12:23.349:INFO:oejs.AbstractConnector:Started SelectChannelConnector@0.0.0.0:3000
Started server on port 3000

Now, it is possible to see what the different encodings look like:

$ curl http://localhost:3000/text
time=1407557612822
greeting=Hello, text!
$ curl -H "Accept: application/edn" http://localhost:3000/edn
{:time 1407558307759, :greeting "Hello, edn!"}
$ curl -H "Accept: application/json" http://localhost:3000/json
{"time":1407558370116,"greeting":"Hello, json!"}
$ curl -H "Accept: application/transit+json" http://localhost:3000/transit-json
["^ ","~:time",1407558488590,"~:greeting","Hello, transit-json!"]
$ curl -H "Accept: application/transit+json;verbose" http://localhost:3000/transit-json-verbose
{"~:time":1407558554647,"~:greeting":"Hello, transit-json-verbose!"}
$ curl -s -H "Accept: application/transit+msgpack" http://localhost:3000/transit-msgpack | xxd
0000000: 82a6 7e3a 7469 6d65 cf00 0001 47b9 08d1  ..~:time....G...
0000010: 52aa 7e3a 6772 6565 7469 6e67 b748 656c  R.~:greeting.Hel
0000020: 6c6f 2c20 7472 616e 7369 742d 6d73 6770  lo, transit-msgp
0000030: 6163 6b21                                ack!

There are just a couple of things to note:

  1. The default Transit/JSON encoding is the non-verbose format. By adding “verbose” to the “Accept” header, liberator-transit emits the verbose JSON encoding.
  2. Since the MessagePack encoding is a binary format, I pipe the output through xxd. Otherwise, unprintable characters are output.

How it works

Liberator has a built-in mechanism for formatting output of sequences and maps automatically depending on the headers in the request, as seen above. Moreover, it’s possible to extend this easily by adding new methods to the render-map-generic and render-seq-generic multimethods in liberator.representation. These methods dispatch on the media type that has been negotiated and take two arguments: the data to render and the context.

Getting the verbose output to work was just a touch trickier. The parameters to the media type are stripped by Liberator. As a result, it is necessary to actually examine the request headers that were placed in the Ring map. Fortunately, this is easy to do as it is a part of the incoming context.

Here is an example:

(defmethod render-map-generic "application/transit+json"
  [data context]
  ;; Default to "" so that a request without an Accept header does not throw
  (let [accept-header (get-in context [:request :headers "accept"] "")]
    (if (.contains ^String accept-header "verbose")
      (render-as-transit data :json-verbose)
      (render-as-transit data :json))))
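
A plain substring search also matches the word verbose anywhere else in the header. As a sketch of a stricter check, the header can be split into its media ranges and parameters first. This is not how liberator-transit does it: verbose-requested? and parse-media-range are hypothetical names, and the parsing is deliberately simple (it ignores quoted parameter values and q-value ordering).

```clojure
(ns example.accept
  (:require [clojure.string :as str]))

(defn- parse-media-range
  "Splits one Accept entry such as \"application/transit+json;verbose\"
  into its media type and a set of parameter strings."
  [entry]
  (let [[media-type & params] (map str/trim (str/split entry #";"))]
    {:media-type (str/lower-case (or media-type ""))
     :params     (set (map str/lower-case params))}))

(defn verbose-requested?
  "Hypothetical helper: true when the Accept header lists
  application/transit+json with a bare verbose parameter."
  [accept-header]
  (boolean
    (some (fn [{:keys [media-type params]}]
            (and (= media-type "application/transit+json")
                 (contains? params "verbose")))
          (map parse-media-range
               (remove str/blank? (str/split (or accept-header "") #","))))))

(println (verbose-requested? "application/transit+json;verbose")) ; prints true
(println (verbose-requested? "application/transit+json"))         ; prints false
(println (verbose-requested? nil))                                ; prints false
```

Because the helper is a pure function of the header string, it can be exercised at the REPL without a running server.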

Further work

This library is really quite simple. I spent far more time creating test.check generators for the various Transit types than I did on the library itself. It mostly exists to provide an even easier way to add Transit to Liberator.

Nonetheless, if other people find a need for it, there is possibly room for improvement, such as:

  1. Is there a better way to handle the request for verbose JSON?
  2. Should the library be configurable? If so, what should be configured and what is the best way to do it?

Monday, 04 August 2014

Six tips for using Polymer with ClojureScript

Over the past month or so, I have started dabbling a bit with writing code for the web/front-end. In large part, I have done so in an effort to have a nice interface for the workshop on core.async that I gave at Lambda Jam and will give again at Strange Loop. I am comfortable enough picking up what I need to know for the back end, but how can I create a site that doesn’t look like it came from 1997? I have used Twitter’s Bootstrap before, and it’s not a bad starting point. However, this time I decided to try using Polymer.

In this entry, I briefly introduce the motivating principle behind Polymer, and I share some of the lessons I learned in creating Polymer elements with ClojureScript. Finally, I answer the question: Why use Polymer and not something like Om/React?

A quick introduction to Polymer

Polymer is a technology developed by Google built atop Web Components, a group of related standards such as templates, HTML imports, custom elements, and shadow DOM. Ultimately, the goal is to combine chunks of styling, markup, and logic into discrete, reusable components. For example, embedding a Google map into a page is as simple as:

<google-map latitude="…" longitude="…"></google-map>

It couldn’t get much simpler than that, right? Well, Polymer has a whole series of UI components designed to help create web applications implementing Google’s new material design. These looked nice, so I decided to put them to use.

Using Polymer with ClojureScript

Overall, using Polymer has been a relatively smooth process. For my workshop, I primarily just used the Polymer element libraries, but I did create a few. I primarily used JavaScript to implement the logic behind my elements. However, there is one component where I absolutely needed to use ClojureScript, as it demonstrates the different types of buffers for core.async channels. Rather than creating a simulation, I used core.async itself. Learning to use the two technologies together correctly took a bit of experimentation, but here are a few tips for you should you decide to go down the same path:

Tip 1: Beware of what you build in your prototype

The typical call to register a Polymer element with a few properties in ClojureScript will look something like:

(js/Polymer
  "my-element-name",
  #js {:foo false
       :bar 42
       :baz "ClojureScript rocks!"})

However, you may want to create an element with more complex properties such as JavaScript arrays or objects or even ClojureScript objects such as a core.async channel for managing callbacks. As such, you may be tempted to do something like:

(js/Polymer
  "my-element-name",
  #js {:anArray #js [1 2 3]
       :anObject #js {:foo 1 :bar 2}
       :channel (async/chan)})

However, as the documentation will warn you, this is not the right way to do it. The object created here is a prototype that will be used in each instance of your element. As such, any complex types will not get deep copies and essentially constitute global shared state. Instead, you need to instantiate these in the created callback, as in:

(js/Polymer
  "my-element-name",
  #js {:anArray nil
       :anObject nil
       :channel nil
       :created #(this-as me
                   (aset me "anArray" #js [1 2 3])
                   (aset me "anObject" #js {:foo 1 :bar 2})
                   (aset me "channel" (async/chan)))})

Tip 2: Pass your element reference around

If you don’t grab onto the this reference inside the scope of your prototype, it can be very hard to get it later. As such, when calling your own functions from a callback, be sure to pass the reference in. For example, the previous example could be rewritten as:

(defn on-create [element]
  (doto element
    (aset "anArray" #js [1 2 3])
    (aset "anObject" #js {:foo 1 :bar 2})
    (aset "channel" (async/chan))))

(js/Polymer
  "my-element-name",
  #js {:anArray nil
       :anObject nil
       :channel nil
       :created #(this-as me (on-create me))})

Tip 3: Use the right property accessors/mutators

This isn’t a Polymer-specific tip, but I got bit by it nonetheless. As you may know, ClojureScript supports two different syntaxes for accessing and mutating JavaScript object properties:

; Syntax one
(.-prop obj)
(set! (.-prop obj) value)

; Syntax two
(aget obj "prop")
(aset obj "prop" value)

When you use ClojureScript’s advanced compilation mode, the compiler will munge the property names if you use the first syntax. The result will be a broken element featuring lots of null or undefined values. Unless you are sure you will never use the advanced compilation mode, just use the second syntax.

Tip 4: Accept mutability

As a Clojure/ClojureScript programmer, it may be very tempting to try to store your application state in an atom and deal with it in a nice, sane way—just the way Om does it. Well, to do so in Polymer while supporting Polymer’s two-way data binding is just a pain. It can certainly be done, but I am not sure it is worth the effort. Let’s take a look at a simple example to see what I am talking about:

(js/Polymer
  "my-element",
  #js {:property "foo"
       :propertyAtom nil
       :propertyChanged #(this-as me
                           (reset! (aget me "propertyAtom") %2))
       :created #(this-as me
                   (let [property-atom (atom (aget me "property"))]
                     (add-watch property-atom ::to-polymer
                       (fn [_ _ _ new-val]
                         (when (not= new-val (aget me "property"))
                           (aset me "property" new-val))))
                     (aset me "propertyAtom" property-atom)))})

What does all of this code do? Well, let’s break it down:

  1. On line 3, we set up the mutable cell that will be visible to Polymer.
  2. On line 4, we reserve a property to hold the atom that our ClojureScript code will use.
  3. On line 5, we set up a handler so that we can update the Clojure atom when the property is changed from Polymer. For example, this may occur if the property was bound to the value of some user input.
  4. On line 8, we create our atom.
  5. On line 9, we set up a watcher on the atom so that if a change comes from the Clojure side, i.e. from a swap!, the change will be pushed into the mutable cell that Polymer will see.
  6. Finally, on line 13, we add the atom to the element’s state.

This is a lot of overhead for just one property. You can try to manage this more easily by just having one atom with all application state in some sort of map/object. This does mean you have fewer atoms to watch, but otherwise it doesn’t help too much.

The reasons for this are twofold:

  1. Now your template code has to reference a bunch of nested properties. What used to be {{property}} now looks like {{state.property}}.
  2. You must now use an observe block to get notifications of property changes, and you must specify each path through your state map.

In practice, this can look something like:

(def init-state
  {:foo 42
   :bar true
   :baz "Why am I not using Om/React?"})

(js/Polymer
  "my-element",
  #js {:state nil
       :stateAtom nil
       :created #(this-as me
                   (aset me "state" (clj->js init-state))
                   (let [state-atom (atom init-state)]
                     (add-watch state-atom ::to-polymer
                       (fn [_ _ _ new-state]
                         ;; Compare as ClojureScript data: two distinct JS
                         ;; objects are never = to each other.
                         (let [current (js->clj (aget me "state")
                                                :keywordize-keys true)]
                           (when (not= new-state current)
                             (aset me "state" (clj->js new-state))))))
                     (aset me "stateAtom" state-atom)))
       :observe #js {:state.foo #(this-as me
                                   (swap! (aget me "stateAtom") assoc :foo %2))
                     :state.bar #(this-as me
                                   (swap! (aget me "stateAtom") assoc :bar %2))
                     :state.baz #(this-as me
                                   (swap! (aget me "stateAtom") assoc :baz %2))}})

This is an awful lot of boilerplate. I suppose it may be possible to automate much of it using a macro, but the question that remains is what does this additional complexity buy you? In my opinion, though it may be a bitter pill to swallow, just accepting and using the mutability of Polymer elements seems to be the most pragmatic route. Doing so allows things like Polymer's two-way data binding to just work.

Tip 5: Properly include the ClojureScript in development mode

There are three ways to register a Polymer element that has logic:

  1. Inside the element definition in the body of a script tag,
  2. Referencing an external script inside the element definition, and
  3. Loading the external script before the element definition.

With ClojureScript, we can choose from the latter two. Most of the time, the second form is the most convenient:

<polymer-element name="my-element">
  <template>…</template>
  <script type="text/javascript" src="my-element.js"></script>
</polymer-element>

However, this only works when one of the ClojureScript optimization levels is used, i.e. whitespace, simple, or advanced. If you are in development mode, you might be tempted to do something like:

<polymer-element name="my-element">
  <template>…</template>
  <script type="text/javascript" src="out/goog/base.js"></script>
  <script type="text/javascript" src="my-element.js"></script>
  <script type="text/javascript">
    goog.require('my_element');
  </script>
</polymer-element>

However, if you are importing your element, e.g. <link rel="import" href="my-element.html">, you will run into problems with the Google Closure library. It doesn’t like this since it will attempt to inject your script’s dependencies after the main page has loaded. Instead, while you are in development mode, place all of your script tags in your main HTML page.

Tip 6: Be wary of processing the component HTML

This isn’t ClojureScript-specific, but I thought I’d include it nonetheless. During development, I use some Ring middleware that injects script tags into the served HTML to connect to my browser REPL environment. This generally works well, but sometimes I saw some really bizarre behaviour. For example, a paper-input component refused to take focus, or a core-submenu component hid its icon.

It turns out that Polymer uses a conditional attribute syntax:

<span hidden?="{{isHidden}}">You can hide me!</span>

If the expression is false, then the attribute will be omitted from the markup. Some HTML parsers such as Enlive or htmlmin cannot process that. At least in the case of htmlmin, it caused a mysterious hang. I received no warning at all from Enlive.

The bottom line is: don’t try to process a component’s HTML unless you are sure your processor can handle it.

Why bother with Polymer?

So given the existence of ClojureScript-friendly frameworks like Om/React, why bother trying to write Polymer elements in ClojureScript? That’s a great question. Here’s my take on it:

  1. The two are not mutually exclusive. You can easily use both Om/React components and Polymer elements on a page.
  2. However, nesting them in each other proves a lot trickier. I don’t think there is any reason why you couldn’t embed an Om/React component inside of a Polymer element. Unfortunately, I am not sure it’s possible to go the other way around. As best as I can tell, React expects to be React all the way down to the DOM.
  3. With the core and paper elements libraries, Polymer offers a compelling case for using it. As I am not a web front-end developer, the ability to easily use nicely-styled widgets and declaratively define animations is particularly nice. I am not an HTML/CSS/JavaScript wizard, and it would take me a long time to implement what Polymer provides. Using Polymer, I can instead spend my time working on my application.
  4. Frankly, I don’t think that ClojureScript is the ideal fit for Polymer. If you need to use ClojureScript, as in my case where I needed to use core.async, it’s certainly an option. However, if ClojureScript isn’t absolutely necessary, consider sticking to JavaScript. In the end, it’s all about choosing the right tool for the job.

Wednesday, 05 February 2014

Dart vs. ClojureScript: two weeks later

A couple of weeks ago, I wrote about my first impressions of the Dart programming language in preparation for GDG Houston’s Dart Flight School event coming up on the 22nd of February. Since then, I have finished the new code lab, the Dart tutorials, and the AngularDart tutorial. For comparison’s sake, I did all but the AngularDart tutorial in both Dart and ClojureScript to get a feel for the differences between the two languages. I have published my ClojureScript implementation of the Dart tutorials on GitHub.

After working my way through all of this code, what’s my take on Dart and ClojureScript? I’ll start with addressing errors from my previous post and then compare Dart and ClojureScript in the following areas:

  • Standard libraries
  • Ecosystem
  • Tooling
  • Debugging
  • Documentation
  • Outside the browser
  • Integrating with Polymer
  • Integrating with Angular
  • Asynchronous programming support

Errata from ‘First impressions’

Before I get to the comparisons, I would like to correct some things I got wrong last time.

Static typing

Dart supports a mix of dynamic and static typing. You can program in Dart without ever declaring a type. However, without the static types, static analysis tools available for Dart will be less effective. Nonetheless, it is a choice you get to make. For example, take the following program:

void main() {
  var noType = "foo";
  noType = 2;

  String withType = "foo";
  withType = 2;

  print("$noType $withType");
}

An IDE or the dartanalyzer program will flag line 6 above and give you a warning that A value of type 'int' cannot be assigned to a variable of type 'String'. Nonetheless, the program will run just fine and output 2 2. However, running the program with ‘checked mode’ enabled (either as a runtime option to the Dart VM or a compile-time option when compiling to JavaScript) will produce an exception at line 6 with a message akin to type 'int' is not a subtype of type 'String' of 'withType'.

There is one place where Dart’s type system does irk me: only false is false and only true is true. In the following program, all of the print statements will print that the value is false (in unchecked mode):

bool isTrue(var v) {
  if (v) {
    return true;
  } else {
    return false;
  }
}

void main() {
  print("0 is ${isTrue(0)}");
  print("1 is ${isTrue(1)}");
  print("empty string is ${isTrue('')}");
  print("non-empty string is ${isTrue('foo')}");
  print("null is ${isTrue(null)}");
}

In this case, the static analyser does not find any problems with the program, but running it in checked mode will produce a type error since the expression for the if condition must be a boolean type.

I have grown used to things like null pointers or zero being false and non-null pointers and non-zero integers being true. I find needing to explicitly make an equality check annoying.

Clojure has core.typed, a library to add gradual typing to Clojure programs. However, using it is not nearly as seamless as is choosing to use static typing in Dart.

Serialisation

This is one area where I got a lot of feedback last time. First, a few points:

  • It is idiomatic Dart to serialise to JSON.
  • Dart’s JSON library can automatically handle num, String, bool, and Null types. List and Map objects can also be automatically serialised subject to a few constraints.
  • Dart also has a serialisation library that serialises objects reflectively or non-reflectively. It is fairly powerful and highly customisable.

What’s the fallout from the above with respect to ClojureScript? I have a couple of thoughts:

  1. ClojureScript’s extensible data notation (EDN) is richer than JSON, making it easier to deal with complex data structures and retain semantic information. For example, is a list of items intended to have constant-time random access (a vector) or contain unique elements (a set)? Additionally, it is extensible and allows you to add support for application-specific types.
  2. ClojureScript’s data-centric approach (using the built-in data structures rather than creating new types) makes serialisation very easy. If you follow the same approach in Dart, you can enjoy many of the same benefits. However, as soon as you introduce new types in either language, the situation becomes more difficult.
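
To illustrate both points about EDN, here is a minimal sketch using clojure.edn from JVM Clojure (ClojureScript has a comparable reader). The #example/point tag and the shape of the map it produces are assumptions made up for this illustration.

```clojure
(ns example.edn
  (:require [clojure.edn :as edn]))

;; Vectors and sets survive a round trip through EDN text as distinct types.
(def data {:ordered [3 1 2] :unique #{1 2 3}})
(def round-tripped (edn/read-string (pr-str data)))

(println (vector? (:ordered round-tripped))) ; prints true
(println (set? (:unique round-tripped)))     ; prints true
(println (= data round-tripped))             ; prints true

;; Extensibility: a reader function for a hypothetical #example/point tag.
(def point
  (edn/read-string {:readers {'example/point (fn [[x y]] {:x x :y y})}}
                   "#example/point [1 2]"))

(println point) ; prints {:x 1, :y 2}
```

In JSON, both collections would come back as plain arrays, and the tagged value would need an ad hoc convention agreed between writer and reader.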

In conclusion, it seems like if you stick with built-in types, both languages have a comparable serialisation story. Nonetheless, I think that the idiomatic Clojure approach to data combined with the richness of EDN gives it an edge over Dart.

Dart vs. ClojureScript

Now that I have spent some more time with both languages, I can make more informed and helpful comparisons between the two. One thing to keep in mind is that these experiences come from going through the Dart tutorials, so they may not play to ClojureScript’s strengths.

Standard library

Dart takes a ‘batteries-included’ approach to its standard library, making it easy to write both web and command-line applications without depending on external libraries.

In comparison, ClojureScript is very much a hosted language. While it has superior support for functional programming and manipulating data structures, when the time comes to move on from solving koans to writing a real application, you discover you have to learn your target platform and how to interoperate with it.

I think of it this way:

Developer: Hi, I want to write a web application.

ClojureScript: Great!

Developer: Um… how do I get a reference to a DOM element?

ClojureScript: That depends.

Developer: On what?

ClojureScript: Well, are you going to use JavaScript directly, use Google’s Closure library, or try a ClojureScript library like Domina or Enfocus? There are also radical alternatives to manipulating the DOM like Pedestal and Om.

Developer: Uh… I don’t know.

Developer spends the next half-day evaluating the ClojureScript options.

Some days later:

Developer: Well, now I need to do something different. I need to use HTML5 IndexedDB.

ClojureScript: Great!

Developer: Is there a nice library for that?

ClojureScript: Sorry, you’ll need to stick to JavaScript or Google Closure. I hope you love callbacks and have brushed up on your interop skills.

Developer groans.

Even later:

Developer: Now I’d like to write a command line utility. I love Clojure, but its overhead is just too big. Can I use ClojureScript?

ClojureScript: Absolutely!

Developer: Great. How do I get started?

ClojureScript: Well, all you need to do is learn Node.js. You’ll find your interop and callback management skills handy.

Developer: I don’t suppose there are any ClojureScript libraries that will make this much easier?

ClojureScript: Nope, you’re on the wild frontier. There are some nice Node modules, though.

Developer considers using Python instead.

Ecosystem

Of course, there is a world beyond the standard library. I can’t account for the quality of libraries, but I can comment on the quantity:

Repository                                       Number of libraries
Clojars (Clojure)
  libraries with a dependency on ClojureScript                   188
  total number of libraries                                    8,270
CPAN (Perl)                                                  129,130
Maven Central (Java)                                          70,518
NPM (JavaScript/Node.js)                                      57,443
Pub (Dart)                                                       690
PyPI (Python)                                                 39,573
RubyGems.org (Ruby)                                           69,863

Both ClojureScript and Dart have far fewer libraries than other, more established, languages. It seems that Dart does have more native libraries available than ClojureScript, and both can take advantage of JavaScript libraries (like those from NPM). It’s hard to tell which language has the true edge in terms of ecosystem.

Tooling

Dart ships an all-in-one package, Dart Editor, that includes:

  • The Dart SDK,
  • A build of Eclipse specialised for Dart (the Dart Editor), and
  • Dartium, a special build of Chrome that includes the Dart VM.

Additionally, there is Dart support as plug-ins for:

  • IntelliJ IDEA and WebStorm
  • Eclipse
  • Emacs
  • Sublime Text 2
  • Vim

I tried the IntelliJ IDEA plug-in, and it seems to be largely on par with Dart Editor, including features like static analysis, code completion, refactoring, and debugging. I also tried the Vim plug-in, but all it does is syntax highlighting.

I believe the Eclipse plug-in is the same as what is bundled with Dart Editor. I cannot speak for the Emacs or Sublime Text 2 support for Dart.

All in all, the tooling story for Dart is pretty solid. For a beginner, there is one download that contains everything to get started. For experienced developers, there is a good chance there is some level of Dart support in their preferred tools.

ClojureScript shares much of the same tooling story as Clojure. I’m not sure what the state of getting started on Clojure is these days, but it seems like Light Table is quickly becoming a popular recommendation.

As an experienced Clojure developer with an established Clojure working toolset, I still find working with ClojureScript more difficult than it ought to be. vim-fireplace supports ClojureScript, but I could never get a configuration that gave me an entirely satisfactory REPL experience. Even when I did manage to connect to the browser, it didn’t seem like my changes were having an effect. Also, features like documentation lookup no longer worked. I’ll accept that this may all be my own fault, but in the end I was back in an edit/compile/reload loop that at times seemed painfully slow (up to about 30 seconds for a recompile).

I have used Light Table with the ClojureScript and the Om tutorials. Undoubtedly, having the instant feedback from using the REPL makes development a much more efficient and enjoyable experience.

Debugging

Although this falls into tooling, I thought I’d draw special attention to debugging. As I mentioned earlier, you can use the debugger in Dart Editor, Eclipse, IDEA, or WebStorm together with Dartium and get a pretty good experience. Dart also prepares source maps when compiling to JavaScript, easing debugging on the browser.

One common complaint about Clojure is its poor error messages. I have never felt that was the case, but I came to Clojure with a lot of JVM experience. I think that ClojureScript definitely lends credence to the notion. It’s possible to enable source map support for ClojureScript, and it helps. However, that also significantly slows down compilation speed. Given how difficult it is to figure out what actually went wrong (often just a mistyped name) from a JavaScript stack trace, I began to really appreciate the static analysis support for Dart.

Documentation

Dart has excellent documentation. There are many tutorials, articles, code labs, and examples all on-line and accessible from Dart’s home page. Many of the tutorials also don’t presume a lot of specific development experience. As an example, this is very helpful to an experienced back-end developer who is just getting started with developing web applications for the browser.

There are good resources for ClojureScript, but they are spread about the web, primarily in different people’s blog posts. The wiki on ClojureScript’s GitHub has some good links, but I found most of my resources through web searches. Additionally, many resources presume that you’re already familiar with the underlying platform, making it just a bit harder to understand what’s going on and how to get started.

Outside the browser

One of the selling points of Dart is that in addition to being able to create browser-based web applications, you can write the server in the same language. This is great; you can reuse some of the same code seamlessly in both the client and the server. Additionally, it is possible to write command line utilities and even interact with native libraries written in languages like C or C++.

This is also possible with ClojureScript. While it is possible to write ClojureScript servers that run atop Node.js, it is much more common to write the server side in Clojure. As with Dart, the primary benefit of this arrangement is the ability to share code between the server and the client. Additionally, the fact that both ends can easily speak EDN to each other helps. The trickiest part of this combination is that there are subtle differences between Clojure and ClojureScript, and you have to be careful to keep your cross-language libraries within the intersection of the two.
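As a rough illustration of the EDN point, here is a minimal Clojure-side sketch. The payload map and its keys are made up for this example. It serialises a value with pr-str and reads it back with clojure.edn/read-string, which is the kind of round trip a Clojure server and a ClojureScript client could perform at either end of the wire.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical payload; the keys are invented for illustration.
(def payload {:user/id 42, :name "Ada", :tags #{:admin :beta}})

;; Serialise to an EDN string, as a server might before sending a response.
(def wire (pr-str payload))

;; Parse it back as data; edn/read-string does not evaluate code.
(def decoded (edn/read-string wire))

;; The round trip preserves keywords, sets, and namespaced keys.
(println (= payload decoded))
```

On the ClojureScript side, cljs.reader/read-string plays the corresponding role, so the same data literals can be used on both ends.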

Integrating with Polymer

Polymer is a library built by Google on top of Web Components designed to make it easy to create and reuse bits of functionality in web pages using custom elements. It is largely JavaScript-based, but there is a port of it to Dart, Polymer.dart. I don’t have any previous experience with Polymer, but I got a nice taste of it working through the tutorials.

Working with Polymer.dart was a relatively easy experience. It, along with Polymer itself, is still in a pre-release state. It’s not quite feature-complete compared to Polymer itself, but seemed pretty solid as a whole. I felt the trickiest part of using Polymer.dart was ensuring that the two-way data binding on nested data structures worked well.

There is no equivalent library for Polymer in ClojureScript, so it’s necessary to use ClojureScript’s interop with Polymer. As a result, you can do it, but getting the data binding to work with ClojureScript is an absolute pain. A good library would go far in making working with Polymer more palatable.

Integrating with Angular

AngularJS is an MVC framework for building large web applications, and AngularDart is said to be the future of the framework. AngularDart is not a strict 1:1 port of AngularJS. It works a bit differently and has been said to be ‘Angular reimagined’.

This is my first exposure to Angular of any flavour, and my impression is that it is a fairly neat framework. I enjoyed working my way through the AngularDart tutorial, but it is clear that it is still a pre-1.0 product. It’s not so much that the library is buggy, but the developer documentation is lacking compared to other parts of the Dart ecosystem.

I have not tried much Angular with ClojureScript; I simply haven’t had the time. There are multiple efforts to make Angular and ClojureScript work better together. Given the popularity of AngularJS, I wouldn’t be surprised if a good AngularCLJS library comes about.

In conclusion, it’s still early for ports of Angular to other languages. However, given that Google is behind AngularDart and pushing it forward, I expect it to mature much more quickly.

Asynchronous programming

Dart has a couple of key features for facilitating asynchronous programming:

  1. Futures, a way of performing work asynchronously; and
  2. Streams, a way of asynchronously managing series of events.

Dart’s futures are quite similar to Clojure’s, though a bit richer. For example, there is built-in support for chaining futures and for propagating errors through a chain of futures. Dart’s streams provide a way of acting on a series of events, such as getting data from a socket or reacting to UI events. Both of these features help ameliorate the ‘callback hell’ problem that’s associated with JavaScript.
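For comparison, here is a small sketch of what Clojure's own built-in future looks like on the JVM. The var names are made up for this example. As far as I can tell, clojure.core offers no dedicated chaining operation, so chaining is done by hand with a second future that derefs the first.

```clojure
;; A future runs its body on another thread; deref (@) blocks for the result.
(def fetched (future (Thread/sleep 50) 21))

;; Hand-written chaining: a second future that waits on the first.
(def doubled (future (* 2 @fetched)))

;; An exception thrown in the body resurfaces on deref, wrapped in an
;; ExecutionException whose cause is the original exception.
(def failing (future (throw (ex-info "boom" {}))))

(def failure-message
  (try
    @failing
    (catch java.util.concurrent.ExecutionException e
      (.getMessage (.getCause e)))))

(println @doubled failure-message)
```

When running this as a script rather than at a REPL, a final (shutdown-agents) lets the JVM exit promptly, since futures run on a shared thread pool.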

In comparison, ClojureScript has no native support for either one of these mechanisms. However, there is core.async, a powerful Clojure and ClojureScript library for asynchronous programming. With it, it is possible to write highly asynchronous code in a fashion that reads as if it were synchronous. This makes the code significantly easier to reason about. David Nolen has written a good introductory article about the power of core.async. The main downside to core.async I have run into is that it makes debugging more difficult due to the immense transformation of the code at compile time.

In the end, while I think Dart’s approach to handling asynchronous programming is fairly decent, it doesn’t have the power of core.async.

Final thoughts

Dart was designed to be a language that is easy to learn and can scale to large projects, and I think it has accomplished that goal. If someone with a background in a language like Java or C++ asked me about a language for developing web applications, I would definitely recommend that they consider Dart. With Dart, as with ClojureScript, it is possible to write both the client and the server in the same language reusing the same code. In fact, it’s probably easier in Dart than in a hybrid Clojure/ClojureScript application.

Does this mean I think Dart is better than ClojureScript? In a word, no. I would still recommend ClojureScript to Lisp aficionados and adventurous programmers. Most importantly, I believe ClojureScript’s Lisp roots make it a playground for innovation. I do not think something like core.async’s go macro is possible in a language like Dart. With a working browser REPL, ClojureScript should have the same highly-interactive development experience that Clojure provides, and that makes programming a much more enjoyable and productive experience.

In the end, both Dart and ClojureScript are great languages. Dart is probably the more ‘practical’ of the two, and certainly the easier to pick up. However, ClojureScript is more powerful and, in my opinion, more fun.

Sunday, 26 January 2014

Book review: Boost C++ Application Development Cookbook

Although I write a lot about Clojure, I use C++ and Boost for the majority of my work. As such, when Packt offered me a review copy of Boost C++ Application Development Cookbook (sample chapter), I gladly took them up on the offer.

About the book

As you may infer from the word ‘cookbook’ in the title, this is not a comprehensive Boost reference or a book you expect to read from cover to cover. It consists of scores of recipes, all of which follow the same formula:

  1. An introduction to the problem to solve.
  2. A brief statement of the prerequisite knowledge for the solution.
  3. A step-by-step walkthrough on how to solve the problem.
  4. A brief explanation as to how/why the solution works.
  5. Discussion about comparable/related functionality in C++11 or in other parts of Boost.
  6. Pointers to related recipes, external resources, and related Boost documentation.

These recipes are all grouped together into chapters, with each chapter having a general topic such as resource management or multithreading.

The intended audience for this book is experienced C++ developers who may not be familiar with all of Boost’s functionality and how it compares with C++11.

Highlights

There are a lot of things that I like about this book, in particular its emphasis on C++11 and the way it introduces a topic and then points you to resources for more in-depth learning.

Though it’s been out for a couple of years now, C++11 is still a relatively new standard and it takes time for programmers and programs to adopt it. Many of the new library features in C++11 are based on Boost libraries, and some Boost libraries exist to help ‘backport’ new language features to older versions of C++. The Cookbook does a very good job of letting the reader know whether C++11 has the same or similar features as Boost and how they differ.

The other thing I really enjoyed about this book is how it gently introduces the reader to Boost. There are a lot of Boost libraries, and the quality of the official documentation varies from very good to cryptic. In the past, I have avoided some libraries simply because I could never figure out how to even get started using them. This book can make some of these more accessible by giving me a simple example from which I can get a toehold. From that point, I can start to make sense of the documentation.

Room for improvement

No book is perfect, and there are a couple of ways in which this book could be more useful:

  1. It is not exhaustive. Granted, the number of Boost libraries is enormous, and some of them have limited applicability. Fortunately, the Cookbook does a good job of covering the most useful libraries.
  2. It only barely touches on the situation when you have to deal with different Boost versions. Some Boost libraries have source and ABI incompatibilities between versions, and it can sometimes be a bit of a nightmare to write code that has to support different versions of Boost. It would have been nice to see if the author had any insights on how to handle that issue.

Concluding thoughts

Boost C++ Application Development Cookbook is definitely worth considering if you are a C++ developer who uses or would like to use Boost. It’s a good reference to have handy when you find yourself in a situation where you think: ‘There has got to be a library for this.’

Friday, 24 January 2014

How Clojure works: more on namespace metadata

In the post How Clojure works: namespace metadata, I commented on how the metadata seemed to be missing from the following macro-expansion of ns:

(ns greeter.hello
  "A simple namespace, worth decompiling."
  {:author "Daniel Solano Gómez"})

; macro-expands once to (with some cleanup):
(do
  (in-ns 'greeter.hello)
  (with-loading-context (refer 'clojure.core))
  (if (.equals 'greeter.hello 'clojure.core)
    nil
    (do
      (dosync 
        (commute @#'*loaded-libs* 
                 conj 
                 'greeter.hello)) 
      nil)))

*print-meta*

Stuart Sierra pointed out in a comment that if we set *print-meta* to true, the metadata actually shows up three times in the macro-expansion:

(do
  (in-ns (quote ^{:author "Daniel Solano Gómez",
                  :doc "A simple namespace, worth decompiling."}
                greeter.hello))
  (with-loading-context (refer 'clojure.core))
  (if (.equals (quote ^{:author "Daniel Solano Gómez",
                        :doc "A simple namespace, worth decompiling."}
                      greeter.hello)
               'clojure.core)
    nil
    (do
      (dosync (commute @#'*loaded-libs*
                       conj
                       (quote ^{:author "Daniel Solano Gómez",
                                :doc "A simple namespace, worth decompiling."}
                              greeter.hello)))
      nil)))
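To reproduce output along these lines at a REPL, something like the following sketch should work. The namespace name sample.ns and its metadata are hypothetical, and the exact shape of the expansion varies between Clojure versions, so the printed result may not match the listing above form for form.

```clojure
;; Expand a throwaway ns form once. Macro-expansion alone does not
;; create or switch to the namespace.
(def expansion
  (macroexpand-1
    '(ns sample.ns
       "A docstring."
       {:author "Someone"})))

;; With *print-meta* bound to true, the printer emits ^{...} in front of
;; any symbol or collection that carries metadata.
(def printed
  (binding [*print-meta* true]
    (pr-str expansion)))

(println printed)
```

Using binding rather than set! keeps the change local to this one call, so the rest of the REPL session prints as usual.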

A couple of observations:

  1. While it was obvious that Clojure wasn't losing the metadata, now we can actually see how it gets processed.

  2. Even though the metadata is expanded three separate times, it only shows up twice in the compiled result. Apparently, when compiling a particular class, the compiler keeps track of which symbols are being used and deduplicates them.

This last point got me thinking: how sensitive is the compiler to the symbols and metadata it encounters?

Modifying the macro-expansion

To find out how sensitive the compiler is to symbols and their metadata, we can replace the ns form above with its macro-expansion and modify it just slightly:

(do
  (in-ns (quote ^{:author "Daniel Solano Gómez",
                  :doc "A simple namespace, worth decompiling."}
                greeter.hello))
  (clojure.core/with-loading-context (clojure.core/refer 'clojure.core))
  (if (.equals 'greeter.hello 'clojure.core)
    nil        
    (do               
      (dosync (commute @#'clojure.core/*loaded-libs*
                       conj
                       (quote ^{:author "Daniel Solano Gómez",
                                :doc "A simple namespace, worth decompiling."}
                              greeter.hello)))
      nil)))                  

The only change is in the comparison on line 6 where we have removed the metadata from the greeter.hello symbol. Functionally, this has no effect as metadata doesn't affect equality. However, does this change the generated code?

Examining the impact

As a matter of fact, it does. We have been careful so that the change has only affected the greeter.hello__init class. Just looking at the class signature, we can see this change made an impact:

package greeter;

import clojure.lang.*;

public class hello__init {
  public static final Var const__0;
  public static final AFn const__1;
  public static final AFn const__2;
  public static final AFn const__3;

  static {}

  public static void load();
  public static void __init0();
}

There is now an additional class AFn constant. When we see the decompiled __init0 method, we can see exactly what has changed:

public static void __init0() {
  const__0 = (Var)RT.var("clojure.core", "in-ns");
  IObj iobj = (IObj)Symbol.intern(null, "greeter.hello");
  Object[] meta = new Object[4];
  meta[0] = RT.keyword(null, "doc");
  meta[1] = "A simple namespace, worth decompiling.";
  meta[2] = RT.keyword(null, "author");
  meta[3] = "Daniel Solano Gómez";
  IPersistentMap metaMap = (IPersistentMap)RT.map(meta);
  const__1 = (AFn)iobj.withMeta(metaMap);
  const__2 = (AFn)Symbol.intern(null, "greeter.hello");
  const__3 = (AFn)Symbol.intern(null, "clojure.core");
}

These changes include:

  • On lines 5-8, the order of the map metadata was changed. I don't think it's a significant change, but it is a change nonetheless;
  • On line 11, const__2 now holds a version of the symbol greeter.hello without the metadata; and
  • On line 12, const__3 holds the reference to clojure.core, which used to be in const__2.

When we examine the decompiled output of load(), we see that, as expected, the version of the greeter.hello symbol that has the metadata is used for the in-ns call and the version without the metadata is used in the comparison to clojure.core:

public static void load() {
  // (in-ns 'greeter.hello)
  IFn inNs = (IFn)const__0.getRawRoot();
  inNs.invoke(const__1); // version with metadata

  // (with-loading-context (refer 'clojure.core))
  IFn loading4910auto = (IFn)new greeter.hello$loading__4910__auto__();
  loading4910auto.invoke();

  // (if (.equals 'greeter.hello 'clojure.core)
  //   nil
  //   (do
  //     (LockingTransaction/runInTransaction (fn* …))
  //     nil))
  Symbol greeterHello = (Symbol)const__2; // version without metadata
  if (greeterHello.equals(const__3)) {
    return null;
  } else {
    Callable callable = (Callable)new greeter.hello$fn__17();
    LockingTransaction.runInTransaction(callable);
    return null;
  }
}

Closing thoughts

First, a big thanks to Stuart Sierra for the tip about *print-meta*; it has been really helpful.

Second, examining the Compiler source code, it becomes a bit clearer what's going on: as the compiler encounters constants, it stores them in a vector to be emitted later. Additionally, it ensures that constants are not duplicated by using an IdentityHashMap, which relies on identity rather than equality. As such, we can see how the two symbols (with and without metadata) would be considered different.

However, what's not entirely clear is how the compiler knows that the two symbols with metadata in the original macro-expansion are identical. I spent some time studying the compiler source, but it's somewhat hard to follow. I could probably use a debugger to trace its execution, but that's an exercise for another day.
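The identity-based deduplication described above can be sketched in a few lines of Clojure. This is only an illustration of the idea: make-registry and register-constant! are hypothetical names and are not taken from Compiler.java.

```clojure
(import 'java.util.IdentityHashMap)

;; Hypothetical registry: a vector of constants plus an identity-keyed index.
(defn make-registry []
  {:ids (IdentityHashMap.) :constants (atom [])})

(defn register-constant!
  "Returns the index of x, adding it only if this exact object is new."
  [{:keys [^IdentityHashMap ids constants]} x]
  (or (.get ids x)
      (let [idx (count @constants)]
        (swap! constants conj x)
        (.put ids x idx)
        idx)))

(def with-md (with-meta 'greeter.hello {:doc "A simple namespace."}))
(def without-md 'greeter.hello)

(def registry (make-registry))
(def idx-a (register-constant! registry with-md))    ; new object
(def idx-b (register-constant! registry with-md))    ; same object, reused
(def idx-c (register-constant! registry without-md)) ; equal but not identical

(println (= with-md without-md) (identical? with-md without-md))
(println idx-a idx-b idx-c)
```

The two symbols compare equal, since metadata plays no part in equality, yet they occupy separate slots because the lookup is by object identity.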

Monday, 20 January 2014

How Clojure works: namespace metadata

In the first How Clojure works post, we examined how a Clojure namespace bootstraps itself. In particular, we saw how beguiling the following program can be.

(ns greeter.hello)

This program actually ends up creating three classes, including two anonymous function classes, each with a static initializer and a handful of constants.

Although I had promised to look at how a def works, I'd like to first add a bit more to our namespace declaration. Let's add some metadata:

(ns greeter.hello
  "A simple namespace, worth decompiling."
  {:author "Daniel Solano Gómez"})

We have added a namespace docstring as well as an attribute map that will be added to the namespace metadata. What do you think will be the result?

Anticipating the changes

Last time, we saw that ns is a macro that actually does quite a bit. So, let's expand it once (and clean up the result so that it can be read):

(do
  (in-ns 'greeter.hello)
  (with-loading-context (refer 'clojure.core))
  (if (.equals 'greeter.hello 'clojure.core)
    nil
    (do
      (dosync 
        (commute @#'clojure.core/*loaded-libs* 
                 conj 
                 'greeter.hello)) 
      nil)))

Well, that's interesting. It's not any different than what we had before. Where did the metadata go? Is it possible that it's all lost? That's not likely. Further macro-expansion won't help, so let's start decompiling.

Edit: As we see in the follow-up entry, using *print-meta* allows us to see the metadata.

Decompilation overview

When we look at the list of generated classes, we find the same three generated classes as before¹:

  • greeter.hello__init
  • greeter.hello$fn__17
  • greeter.hello$loading__4910__auto__

We are still not seeing anything new, so it's time to break out the decompiler and see what's going on at a deeper level. Let's start with the namespace class, greeter.hello__init.

greeter.hello__init

The class signature of greeter.hello__init hasn't changed:

package greeter;
import clojure.lang.*;

public class hello__init {
  static {}

  public static final Var const__0;
  public static final AFn const__1;
  public static final AFn const__2;

  public static void load();
  public static void __init0();
}

However, if we examine the decompiled code, we find some changes to the __init0 method, so let's take a closer look at that.

__init0()

Examining the new content of the __init0 method, we begin to see what's going on:

public static void __init0() {
  const__0 = (Var)RT.var("clojure.core", "in-ns");
  IObj iobj = (IObj)Symbol.intern(null, "greeter.hello");
  Object[] meta = new Object[4];
  meta[0] = RT.keyword(null, "author");
  meta[1] = "Daniel Solano Gómez";
  meta[2] = RT.keyword(null, "doc");
  meta[3] = "A simple namespace, worth decompiling.";
  IPersistentMap metaMap = (IPersistentMap)RT.map(meta);
  const__1 = (AFn)iobj.withMeta(metaMap);
  const__2 = (AFn)Symbol.intern(null, "clojure.core");
}

As before, const__0 refers to the clojure.core/in-ns var and const__2 refers to the clojure.core symbol. The big difference here is that Clojure is no longer storing the greeter.hello symbol it creates. Instead, it creates that symbol, 'adds'² the metadata to the symbol, and stores the result in const__1.

This explains, to some extent, where the metadata went. It has been preserved by the compiler, but how can the Clojure runtime access the metadata? The greeter.hello__init class doesn't implement IMeta. It seems unlikely that the runtime would scour the class constants of loaded namespace classes looking for metadata.

Clearly, there is more to investigate. Let's take a look at the greeter.hello$loading__4910__auto__ class next.

greeter.hello$loading__4910__auto__

This is the class that implements (with-loading-context (refer 'clojure.core)). It hasn't changed as a result of the new metadata, so let's move onto the last generated class.

hello$fn__17

This is the anonymous function class that registers the namespace with Clojure. It effectively implements the following Clojure code:

(commute @#'clojure.core/*loaded-libs* 
         conj 
         'greeter.hello)

Decompiling the class, we see that it hasn't changed much. As with greeter.hello__init, the class signature is identical. In this case, the implementation of the static initialiser differs:

static {
  const__0 = (Var)RT.var("clojure.core", "commute");
  const__1 = (Var)RT.var("clojure.core", "deref");
  const__2 = (Var)RT.var("clojure.core", "*loaded-libs*");
  const__3 = (Var)RT.var("clojure.core", "conj");
  IObj iobj = (IObj)Symbol.intern(null, "greeter.hello");
  Object[] meta = new Object[4];
  meta[0] = RT.keyword(null, "author");
  meta[1] = "Daniel Solano Gómez";
  meta[2] = RT.keyword(null, "doc");
  meta[3] = "A simple namespace, worth decompiling.";
  IPersistentMap metaMap = (IPersistentMap)RT.map(meta);
  const__4 = (AFn)iobj.withMeta(metaMap);
}

As before, the first four class constants refer to the vars for clojure.core/commute, clojure.core/deref, clojure.core/*loaded-libs*, and clojure.core/conj. For the fifth class constant, instead of storing the symbol greeter.hello directly, it adds the metadata to the symbol before storing it in the class constant. So what are the consequences of this?

Well, when invoke() is called on this anonymous function, it ensures that clojure.core/*loaded-libs* will contain the symbol that contains the metadata. So, this must be where the namespace metadata comes from, right?

Digging deeper

At this point in my investigation, I was a little bit confused. At first, I thought that the namespace metadata must come from the *loaded-libs* var, but that's just a ref to a sorted set of symbols. However, if I want to get the metadata from a namespace at the REPL, I use (meta (find-ns 'greeter.hello)), and the type of the object returned by find-ns is a Namespace instance, not a Symbol. This got me thinking: what is the purpose of *loaded-libs* and where is the Namespace instance created?

The purpose of *loaded-libs*

*loaded-libs* is a private var declared in core.clj. You can get its content, a sorted set of symbols, via the loaded-libs function. It is used indirectly by require and use to keep track of what namespaces have been loaded. For example, when you use require without :reload or :reload-all, the presence of the namespace name symbol in *loaded-libs* will keep the namespace from being reloaded.

When using :reload-all, Clojure uses an initially-empty, thread-local binding of *loaded-libs*. This allows all dependencies of the desired library to be reloaded once, and the resulting set of loaded namespace name symbols is added to the root binding of *loaded-libs*.

As a result, the metadata attached to the symbol in *loaded-libs* is not the metadata we get from the namespace object. For that, we'll have to take a closer look at the metadata attached to the symbol at greeter.hello__init/const__1.

Another look at greeter.hello__init/load

Looking back at greeter.hello__init, the namespace name symbol with metadata is stored in a class constant, const__1. The only place where this constant is used is in the load() method, which is decompiled as follows:

public static void load() {
  // (in-ns 'greeter.hello)
  IFn inNs = (IFn)const__0.getRawRoot();
  inNs.invoke(const__1);

  // (with-loading-context (refer 'clojure.core))
  IFn loading4910auto = (IFn)new greeter.hello$loading__4910__auto__();
  loading4910auto.invoke();

  // (if (.equals 'greeter.hello 'clojure.core)
  //   nil
  //   (do
  //     (LockingTransaction/runInTransaction (fn* …))
  //     nil))
  Symbol greeterHello = (Symbol)const__1;
  if (greeterHello.equals(const__2)) {
    return null;
  } else {
    Callable callable = (Callable)new greeter.hello$fn__17();
    LockingTransaction.runInTransaction(callable);
    return null;
  }
}

As we see here, there are two places where this constant is used:

  1. In lines 3-4, it is used as the argument for in-ns.
  2. In lines 15-16, it is used in a comparison to 'clojure.core.

In the second case, the metadata has no effect, but what of the first?

A closer look at in-ns

in-ns is a bit special. Unlike most of clojure.core, it is not defined in core.clj. Instead, it is constructed within RT.java. Its value is actually an anonymous AFn implementation also defined in RT.java. This implementation is fairly simple, and the noteworthy bit is that the symbol that is passed to in-ns is further passed to the static method clojure.lang.Namespace/findOrCreate.

[Figure: class diagram of clojure.lang.Namespace]

Namespace contains a static member called namespaces, which is a map³ of namespace name symbols to Namespace object instances. When findOrCreate is called and there is no mapping for the symbol yet, a new Namespace instance is created and inserted into the map.

The Namespace class extends clojure.lang.AReference, which holds metadata and indirectly implements clojure.lang.IMeta. As such, the Namespace constructor uses the metadata from the namespace name symbol as its metadata.

At last, we now know how a namespace gets its metadata. Looking at the implementation of find-ns, we see that it just calls Namespace/find which merely does a lookup in the namespaces map.
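The path from symbol metadata to namespace metadata can be observed at a REPL without writing an ns form. This is a minimal sketch: sample.meta-demo and its metadata are hypothetical, and create-ns is used here because it also goes through Namespace/findOrCreate.

```clojure
;; A namespace name symbol carrying metadata, as the ns macro would build it.
(def ns-sym
  (with-meta 'sample.meta-demo
    {:doc "A demo namespace." :author "Someone"}))

;; Creating the namespace copies the symbol's metadata onto the Namespace.
(def created (create-ns ns-sym))

;; Lookup works with a plain symbol, since metadata does not affect equality.
(def found (find-ns 'sample.meta-demo))
(println (meta found))

;; A later call with different metadata returns the existing Namespace
;; and leaves its metadata as it was.
(def again
  (create-ns (with-meta 'sample.meta-demo {:author "Someone else"})))
(println (:author (meta again)))
```

The last step shows that the metadata is taken from the symbol only when the Namespace instance is first constructed.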

Parting thoughts

  1. If the purpose of *loaded-libs* is primarily to keep track of what namespaces have been loaded, does it really need metadata? Metadata doesn't affect the equality of symbols. Arguably, adding metadata to symbols in *loaded-libs* is a waste of memory.
  2. One interesting finding is that at the very heart of Clojure is a bit of mutable state. Keeping track of loaded libraries uses Clojure's concurrency utilities and persistent data structures, but namespaces rely on a Java concurrent collection.
  3. All new namespaces are initialised with a default set of import mappings, mostly classes from java.lang. This has two main implications:
    1. The only thing special about classes from java.lang is that their mappings are hard-coded. If a new java.lang class were to be added to Java, it wouldn't get imported by default until RT.java is updated with a new mapping.
    2. Imports are mappings of symbols to class objects, and there is not a separate set of mappings for Clojure vars.
  4. Since Clojure keeps track of what's loaded in two different places, it's possible to mess up the environment in strange ways. In particular, remove-ns does not clear a symbol from *loaded-libs*, meaning that it would be possible to get Clojure into a state where it thinks a namespace is loaded when actually it is not.
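Points 3 and 4 can be poked at from a REPL. The following is a sketch under some assumptions: sample.ghost is a hypothetical namespace name, and the commute call imitates the bookkeeping that the ns macro performs by reaching into the private *loaded-libs* var directly.

```clojure
;; A fresh namespace, created without going through ns.
(def ghost (create-ns 'sample.ghost))

;; Point 3: it already maps java.lang class names...
(def string-import (get (ns-imports ghost) 'String))
;; ...but knows nothing about clojure.core vars until they are referred.
(def map-var (ns-resolve ghost 'map))

;; Imitate the ns macro: record the name in the private *loaded-libs* ref.
(dosync (commute @#'clojure.core/*loaded-libs* conj 'sample.ghost))

;; Point 4: remove-ns clears the namespace registry but not *loaded-libs*.
(remove-ns 'sample.ghost)
(def still-listed? (contains? (loaded-libs) 'sample.ghost))
(def still-exists? (some? (find-ns 'sample.ghost)))

(println string-import map-var still-listed? still-exists?)

;; Tidy up the bookkeeping so the session is left as it was found.
(dosync (commute @#'clojure.core/*loaded-libs* disj 'sample.ghost))
```

After remove-ns, the name is still listed as loaded even though find-ns can no longer locate the namespace, which is the inconsistent state described in point 4.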

Footnotes

  1. Note that the names of the anonymous function classes can be different each time you compile.
  2. Unsurprisingly, symbols are immutable and annotating one with metadata generates a new symbol.
  3. In particular, it is a ConcurrentHashMap, a concurrency-friendly implementation of a map from Java. It is not a persistent data structure, but it does support concurrent reads and limited concurrent writes.