ciscratch notes

I’ve been having fun today sitting on a heating pad for a bad lower back and learning a bit of statistics, python, clojure and Incanter by porting Toby Segaran’s “Programming Collective Intelligence” to clojure. My code is on github.

I just thought I’d share a few of my initial thoughts after today’s futzing around.

1. I like the clojure set functions. They feel much clearer to me than iterating over a set of keys to find a match. I would often write a helper function to do that, or add a comment for clarity, but with clojure it is just (intersection coll1 coll2).

2. let is a wonderful thing when you have to string a lot of formulae together (or even when it isn’t a lot yet). It would look almost like perl if I did it all inline.

3. I love, love, love the combination of a good repl with easy-to-use unit tests. Explore in one and codify your understanding in the other (I’ll let you guess which way around that works).

4. I got caught out by operator precedence when translating from the python to clojure. Even in Java I try to make the order of operations explicit. The problem was easy to see once I found it, but it took me ages to actually find the bug.

5. I love being able to do a sum of products by doing (sum (map * ratings1 ratings2)) or (reduce + (map * ratings1 ratings2)). You can just keep adding collections to the map and it will keep multiplying.

6. I’m having fun. I’m enjoying teasing out the python code. I like how doing the translation forces me to read the python properly and to think about how I’d do it in clojure.
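
As a rough illustration of how points 1, 2, 4 and 5 can come together, here is a minimal sketch of a Pearson correlation. It is not the code from the github repo. It assumes each critic's ratings are a map from film title to numeric score, and the names pearson, prefs1 and prefs2 are hypothetical. It uses (reduce + ...) rather than Incanter's sum so that it only needs clojure.set.

```clojure
(require '[clojure.set :refer [intersection]])

;; Assumed data shape (hypothetical): {"Film title" rating, ...}
(defn pearson
  "Pearson correlation over the films both critics rated.
  Returns 0 when there are no common films or when either
  critic's common ratings have zero variance."
  [prefs1 prefs2]
  (let [;; point 1: the shared films are just a set intersection
        common   (intersection (set (keys prefs1)) (set (keys prefs2)))
        n        (count common)
        ;; a map is a function of its keys, so this pulls out the ratings
        r1       (map prefs1 common)
        r2       (map prefs2 common)
        sum1     (reduce + r1)
        sum2     (reduce + r2)
        ;; point 5: sums of products via map with several collections
        sum1-sq  (reduce + (map * r1 r1))
        sum2-sq  (reduce + (map * r2 r2))
        sum-prod (reduce + (map * r1 r2))]
    (if (zero? n)
      0 ; no common films: avoid dividing by zero
      ;; point 4: prefix forms spell out the order of operations
      (let [num (- sum-prod (/ (* sum1 sum2) n))
            den (Math/sqrt (double (* (- sum1-sq (/ (* sum1 sum1) n))
                                      (- sum2-sq (/ (* sum2 sum2) n)))))]
        (if (zero? den) 0 (/ num den))))))
```

At the repl, (pearson {"A" 1 "B" 2 "C" 3} {"A" 2 "B" 4 "C" 6 "D" 1}) evaluates to 1.0, since the three shared films are rated in perfect proportion, and (pearson {"A" 1} {"B" 2}) returns 0 because there is nothing in common to compare.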

3 thoughts on “ciscratch notes”

  1. Good work Bruce. Nice idea about porting the examples to Clojure. You’ve reminded me I need to write up my experiences of Clojure so far.

    1. That looks interesting and reminds me of some of the syntactic sugar I like in clojure (especially how he puts the data together). Having clojure.set makes things really clear for me too (though I understand that it makes it clear for *me* and not the world).

      Our Pearson Correlations look very similar. I’m glad I wasn’t the only one who thought it needed a gigantic let. His Pearson also has a divide by 0 error when there are no common films, which I test for in mine. I might still steal some ideas though. Thanks for the pointer.
