Incanter Development Roadmap

I want to discuss plans for future Incanter development, and I’m looking for volunteers interested in contributing to any of the following projects, as well as suggestions for other improvements.

My list of priorities, in no particular order:

1. Create new functions based on the Java libraries already included in Incanter. For instance, I would like to improve incanter.optimize by a) including the nonlinear optimization routines available in Parallel Colt, b) writing new routines in Clojure, and c) improving the existing routines.

2. Integrate Parallel Colt’s sparse matrix support, as suggested by Mark Reid.

3. Expose more of the chart customizability of JFreeChart in the incanter.chart library, e.g. enabling annotations of categorical charts, allowing users to set the scale on axes, customizing colors, etc.

4. Create an incanter.viz library, consisting of Processing-based data visualizations.

5. Integrate the Weka machine learning library.

6. Provide additional statistical methods.

7. Optimize existing functions; in general, I have favored ease of use and expressiveness over performance, but there is A LOT of room to optimize without compromising usability.

Any other suggestions or feedback are welcome, and should be directed to the Incanter Google Group, where I have started an “Incanter Roadmap” thread. You can subscribe to the mailing list here.

David

6 responses to “Incanter Development Roadmap”

  1. Integrating the Weka ML library is an interesting idea, but the algorithms in Weka are a little dated. I suspect that you will have to do a lot of converting back and forth between Weka’s data formats and Incanter’s.

    Some other Java ML libraries you may want to consider are RapidMiner (formerly YALE) and LingPipe (though this is more for text processing).

    One other suggestion I have is you may want to make it easier to use sparse matrix features of Parallel Colt from Incanter. I started out wanting to implement a simple SGD algorithm on top of Incanter but found it was easier just to use Parallel Colt’s sparse matrix library directly.

    If you’re amenable, I’d be interested in adding some online and boosting algorithms to Incanter.

    • Mark,

      Thanks for the suggestions on Java ML libraries; your opinion on this carries a lot of weight.

      “One other suggestion I have is you may want to make it easier to use sparse matrix features of Parallel Colt from Incanter.”

      I’ll add that to the TODO list.

      “If you’re amenable, I’d be interested in adding some online and boosting algorithms to Incanter.”

      I am very amenable, and look forward to any and all contributions from you!

      -David

  2. Just to be clear: I have used Weka and am familiar with its strengths and shortcomings, but I have only read about RapidMiner and LingPipe, not used them in any serious way.

    • Well, both RapidMiner and LingPipe are certainly worth investigating. I noticed that RapidMiner can be used under the AGPL license, which I need to investigate a bit to see how it compares to the LGPL used in a couple of the existing libraries.

  3. “3. Expose more of the chart customizability of JFreeChart in the incanter.chart library, e.g. enabling annotations of categorical charts, allowing users to set the scale on axes, customizing colors, etc.”

    I am using Incanter for outputting charts, and have added some functions for things like setting an axis to a logarithmic scale. I will post them to the Incanter group message board once I’m back from holiday.

    There is also Dejcartes, an interface to JFreeChart that might be useful to you.
