Category Archives: visualization

Reading and writing Excel (xls) files with Incanter

I have just added David James Humphreys’ incanter-excel module to the Incanter distribution, providing basic capabilities for reading Microsoft Excel spreadsheets in as Incanter datasets and saving datasets back out as xls files.

I have posted a simple spreadsheet of Australian airline passenger data from the 1950s to the Incanter website for the following example. The read-xls function can read xls files given either a filename or URL, so you won’t need to download the file.

Start by loading the necessary libraries, including incanter.excel.

(use '(incanter core charts excel))

Next, we can use the with-data macro to bind a dataset converted from the above xls file, using the read-xls function, and then view it.

(with-data (read-xls "http://incanter.org/data/aus-airline-passengers.xls")
  (view $data)

The read-xls function takes an optional argument called :sheet that takes either the name or index of the worksheet from the xls file to read (in this case either “dataset” or 0) , it defaults to 0.

[NOTE: A current weakness of read-xls is that cells containing formulae, as opposed to actual data, are not imported (i.e. the cells remain empty).]

Finally, we’ll create a time-series plot of the data. However, the time-series-plot needs time in milliseconds, so we’ll first create a function that converts the date column from Java Date objects to milliseconds, and then view the plot.

  (let [to-millis (fn [dates] (map #(.getTime %) dates))] 
    (view (time-series-plot (to-millis ($ :date)) ($ :passengers)))))

Datasets can also be saved as Excel files using the save-xls function. The following example just reads in one of the sample datasets using incanter.datasets/get-dataset and then saves it as an xls file.

(save-xls (get-dataset :cars) "/tmp/cars.xls")

The incanter-excel module is now included in the Incanter distribution on Github, and is available as a separate dependency from the Clojars repository. The complete code from this example can be found here.

New default theme and customization features for Incanter charts

I just updated Incanter‘s default chart theme. The new theme is inspired by Hadley Wickham‘s awesome ggplot2 package for R.

The first example is my usual “hello world” chart, a histogram of data sampled from a normal distribution.

(use '(incanter core charts stats datasets))

(view (histogram (sample-normal 1000)))

The next example is a scatter plot of the sepal-length vs. sepal-width from the built-in iris data set.

(view (scatter-plot :Sepal.Length :Sepal.Width :data (get-dataset :iris)))

In addition to changing the default theme, I have included new functions for customizing the appearance of the charts. Here’s an example of the set-stroke function, used here to change the color of the data points in the previous chart.

(doto (scatter-plot :Sepal.Length :Sepal.Width :data (get-dataset :iris))
  (set-stroke-color java.awt.Color/gray)
  view)

The next example uses the :group-by option to color the points based on their species.

(view (scatter-plot :Sepal.Length :Sepal.Width 
                    :group-by :Species 
                    :data (get-dataset :iris)))

This example uses function-plot to create an xy-plot of the sine and cosine functions.

(doto (function-plot sin -10 10)
  (add-function cos -10 10)
  view)

This example uses the $rollup and bar-chart functions to plot the data from the built-in hair-eye-color data set.

(with-data (->>  (get-dataset :hair-eye-color)
             ($rollup :sum :count [:hair :eye]))
  (view (bar-chart :hair :count :group-by :eye :legend true)))

This example uses the box-plot function to plot data from three gamma distributions.

(doto (box-plot (sample-gamma 1000 :shape 1 :rate 2)
                :legend true :y-label "")
  view 
  (add-box-plot (sample-gamma 1000 :shape 2 :rate 2))
  (add-box-plot (sample-gamma 1000 :shape 3 :rate 2)))

The following examples are based on the charts in figure 4.2 of chapter four of “The Joy of Clojure“, where the performance characteristics of Clojure’s data structures are discussed.

First define the two functions to plot, and the range of values to plot them over.

(defn log32 [x] (/ (log x) (log 32)))
(defn f1 [n] (plus (log2 n) (mult (log32 n) 5000)))
(defn f2 [n] n)

(def min-val 10)
(def max-val 40000)

Next, create the plot and use the set-stroke function to increase the stroke thickness for both lines, and make the second line dashed.

(def chart (doto (function-plot f1 min-val max-val 
                   :legend true 
                   :series-label "O(log2 n) + O(log32 n) * 5000"
                   :x-label ""
                   :y-label "")
             (add-function f2 min-val max-val 
                           :step-size 5000 
                           :series-label "O(n)") 
             (set-stroke :width 2)
             (set-stroke :width 2 :dataset 1 :dash 5)))

(view chart)

The three charts in the book are of the same data but each focuses on a different region. You can use the set-y-range and set-x-range functions to zoom-in on each of the different regions.

;; PLOT (A)
(doto chart
  (set-title "(A)")
  (set-x-range 100 5000)
  (set-y-range 30 12000))

;; PLOT (B)
(doto chart
  (set-title "(B)")
  (set-y-range 10000 16000)
  (set-x-range 10000 16000))

;; PLOT (C)
(doto chart
  (set-title "(C)")
  (set-y-range 0 30000)
  (set-x-range 0 30000))

The new theme is available in the latest version of Incanter on Clojars and Github, and the complete code for the above examples is available here.

Dynamic charts with Incanter

I had the opportunity to attend last week’s PragmaticStudio Clojure workshop taught by Rich Hickey and Stuart Halloway (I highly recommend it, and there are still seats open for the May class). During the three days I talked with Rich about features he’d like to see in Incanter, and the first thing he asked about was adding dynamic charts, like are available in Mathematica using the Manipulate function. So I ended up spending much of my lab time working on this feature, the first draft of which is now available.

Incanter has three new macros, sliders, dynamic-xy-plot, and dynamic-scatter-plot. The sliders macro can bind multiple named sequences to an equal number of JSlider widgets. When a slider is manipulated a user defined expression is evaluated. For instance, the following code will display two slider widgets bound to two sequences, x and y.

(sliders [x (range -3 3 0.01)
          y (range 0.01 10 0.1)]
  (foo x y))
  

Each time one of the sliders is manipulated the expression (foo x y) will be evaluated with the new value of either x or y. Presumably, foo has side effects, like changing the value of a ref or manipulating a GUI widget, since it is running in the separate thread used by the slider widget.

I then combined this macro with incanter.charts/set-data function to create dynamic versions of xy-plot and scatter-plot, named appropriately dynamic-xy-plot and dynamic-scatter-plot respectively.

The following example creates an xy-plot of a sequence of values named x versus the normal PDF of x, and displays two sliders bound to the mean and standard deviation of the PDF.

(let [x (range -3 3 0.1)]
  (view (dynamic-xy-plot [mean (range -3 3 0.1)
                          std-dev (range 0.1 10 0.1)]
          [x (pdf-normal x :mean mean :sd std-dev)])))

The expression provided to dynamic-xy-plot must produce a sequence containing either two sequences with N values, or N sequences with two values each. In other words, a N-by-2 matrix or a 2-by-N matrix, where N is the number of points plotted. The expression above,

[x (pdf-normal x :mean mean :sd std-dev)]

produces a vector containing a sequence called x and the sequence produced by calling pdf-normal on x (equivalent to a N-by-2 matrix).

Manipulating the sliders will change the shape and position of the curve.

The dynamic-scatter-plot macro works the same way as dynamic-xy-plot. All three macros are available in the version of Incanter on Github and in repo.incanter.org.

Data Sorcery with Clojure & Incanter: Introduction to Datasets & Charts

I put together my slides (pdf) for next week’s National Capital Area Clojure Users Group February Meetup. Being snow-bound this week, I’ve been able to make more slides than I’ll have time to cover during next week’s session, so I’ll be skimming over some of the examples.

Russ Olsen will start the session with an introduction to Clojure, so if you’re in the D.C. area next Thursday (February 18), sign-up for the meetup.

The code used in this presentation is available here, and a more printer-friendly version of the presentation itself, with a white background, is available here.

Creating Processing Visualizations with Clojure and Incanter

The Processing language, created by Ben Fry and Casey Reas,
“is an open source programming language and environment for people who want to program images, animation, and interactions.”

Incanter now includes the incanter.processing library, a fork of Roland Sadowski‘s clj-processing library, making it possible to create Processing visualizations with Clojure and Incanter. Incanter.processing provides much more flexibility when creating customized data visualizations than incanter.charts — of course, it is also more complex to use.

Several nice examples of the kinds of visualizations that can be created in Processing can be found on Ben Fry’s website and blog, including the cool zipcode, human vs. chimps, baseball salary vs. performance examples.

The processing website has a set of tutorials, including one on getting started, and there are also three great books on Processing worth checking out:

The API documentation for incanter.processing is still a bit underdeveloped, but is perhaps adequate when combined with the excellent API reference on the Processing website.

Incanter.processing was forked from of Roland Sadowski’s clj-processing library in order to provide cleaner integration with Incanter. I have added a few functions, eliminated some, and changed the names of others. There were a few instances where I merged multiple functions (e.g. *-float, *-int) into a single one to more closely mimic the original Processing API; I incorporated a couple functions into Incanter’s existing multi-methods (e.g. save and view); I eliminated a few functions that duplicated existing Incanter functions and caused naming conflicts (e.g. sin, cos, tan, asin, acos, atan, etc); and I changed the function signatures of pretty much every function in clj-processing to require the explicit passing of the ‘sketch’ (PApplet) object, whereas the original library passes it implicitly by requiring that it is bound to a variable called *applet* in each method of the PApplet proxy.

These changes make it easier to use Processing within Incanter, but if you just want to write Processing applications in Clojure without all the dependencies of Incanter, then the original clj-processing library is the best choice.

A Simple Example

The following is sort of a “hello world” example that demonstrates the basics of creating an interactive Processing visualization (a.k.a sketch), including defining the sketch’s setup, draw, and mouseMoved methods and representing state in the sketch using closures and refs. This example is based on this one, found at John Resig‘s Processing-js website.


Click on the image above to see the live Processing-js version of the sketch.

Start by loading the incanter.core and incanter.processing libraries,

(use '(incanter core processing))

Now define some refs that will represent the state of the sketch object,

(let [radius (ref 50.0)
      X (ref nil)
      Y (ref nil)
      nX (ref nil)
      nY (ref nil)
      delay 16

The variable radius will provide the value of the circle’s radius; X and Y will indicate the location of the circle; and nX and nY will indicate the location of the mouse. We use refs for these values because their values are mutable and need to be available across multiple functions in the sketch object.

Now define a sketch object, which is just a proxied processing.core.PApplet, and its required setup method,

sktch (sketch
        (setup []
          (doto this
            (size 200 200)
            (stroke-weight 10)
            (framerate 15)
            smooth)
          (dosync
            (ref-set X (/ (width this) 2))
            (ref-set Y (/ (width this) 2))
            (ref-set nX @X)
            (ref-set nY @Y)))

The first part of the setup method sets the size of the sketch, the stroke weight to be used when drawing, the framerate of the animation, and indicates that anti-aliasing should be used. The next part of the method uses a dosync block and ref-set to set initial values for the refs. Note the @ syntax to dereference (access the values of) the refs X and Y.

Processing sketches that use animation require the definition of a draw method, which in this case will be invoked 15 times per second as specified by the framerate.

  (draw []
   (dosync
     (ref-set radius (+ @radius 
                        (sin (/ (frame-count this) 4))))
     (ref-set X (+ @X (/ (- @nX @X) delay)))
     (ref-set Y (+ @Y (/ (- @nY @Y) delay))))
   (doto this
     (background 125) ;; gray
     (fill 0 121 184)
     (stroke 255)
     (ellipse @X @Y @radius @radius)))

The first part of the draw method uses dosync and ref-set to set new values for the radius, X, and Y refs for each frame of the animation. The sin function is used to grow and shrink the radius over time. The location of the circle, as indicated by X and Y, is determined by the mouse location (nX and nY) with a delay factor.

The next part of the draw method draws the background (i.e. gray background) and the circle with the ellipse function.

Finally, we need to define the mouseMoved method in order to track the mouse location, using the mouse-x and mouse-y functions, and set the values of the nX and nY refs. All event functions in incanter.processing, including mouseMoved, require an event argument; this is due to limitations of Clojure’s proxy macro, and isn’t required when using the Processing’s Java API directly.

(mouseMoved [mouse-event]
  (dosync
    (ref-set nX (mouse-x mouse-event)) 
    (ref-set nY (mouse-y mouse-event)))))]

Now that the sketch is fully defined, use the view function to display it,

(view sktch :size [200 200]))

The complete code for this example can be found here.

In future posts I will walk through other examples of Processing visualizations, some of which can be found in the Incanter distribution under incanter/examples/processing.