Catch Up

9th November 2011

I've not posted for ages, so here is a summary of a bunch of stuff I've been looking at for fun.

Machine Learning

First up, after finishing the MIT Introduction to Algorithms lectures, I was excited to hear about Stanford's free computer science courses. They are full, taught and (machine) assessed university modules for free! I'm studying Machine Learning and I am really impressed with the quality of the teaching. Thanks Stanford.

There is of course speculation that this is a trial for a new paid remote service. To be honest, I feel the quality of the course I've done would be worth paying for if they could find a way to accredit it as a real qualification without proper human assessment.

C++ Experiments

Following on from my experiments with LevelDB, I have played around with creating a C++ gossip implementation, based on Cassandra's, using ZeroMQ. I spent a lot of time getting a really basic grasp of the intricacies of threading versus event-driven style plus message passing. I ended up with multiple processes on the same machine (on different ports) gossiping and effectively sharing cluster state. I didn't get around to implementing the full phi accrual failure detection for inferring whether machines are up or down, and I'm sure the code would need to be torn apart and rewritten for anything resembling real use, but it was a good learning exercise.
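For anyone wondering what the phi accrual part involves, here is a minimal sketch of the idea. It is not code from my experiment: the class and method names are hypothetical, and it makes the simplifying assumption that heartbeat inter-arrival times are exponentially distributed, so phi works out as the time since the last heartbeat divided by (mean interval × ln 10).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <deque>
#include <numeric>

// Simplified phi accrual failure detector for a single peer.
// All names here are hypothetical; times are in milliseconds.
class PhiAccrualDetector {
public:
    explicit PhiAccrualDetector(std::size_t window = 1000) : window_(window) {
        assert(window > 0);
    }

    // Record a heartbeat that arrived at time now_ms.
    void heartbeat(double now_ms) {
        if (has_last_) {
            intervals_.push_back(now_ms - last_ms_);
            // Keep only the most recent `window_` intervals.
            if (intervals_.size() > window_) intervals_.pop_front();
        }
        last_ms_ = now_ms;
        has_last_ = true;
    }

    // phi = -log10(P(silence lasts at least this long)).
    // Assumption: exponential inter-arrival times, so
    // P = exp(-elapsed / mean) and phi = elapsed / (mean * ln 10).
    double phi(double now_ms) const {
        if (intervals_.empty()) return 0.0;  // not enough history yet
        const double sum =
            std::accumulate(intervals_.begin(), intervals_.end(), 0.0);
        const double mean = sum / static_cast<double>(intervals_.size());
        if (mean <= 0.0) return 0.0;
        return (now_ms - last_ms_) / (mean * std::log(10.0));
    }

    // Higher thresholds mean fewer false suspicions but slower detection.
    bool suspect(double now_ms, double threshold = 8.0) const {
        return phi(now_ms) > threshold;
    }

private:
    std::size_t window_;
    std::deque<double> intervals_;
    double last_ms_ = 0.0;
    bool has_last_ = false;
};
```

The point of the design is that the detector outputs a suspicion level rather than a hard up/down answer, so each node can choose its own threshold. The default of 8.0 above is just an illustrative value.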

I've now moved on to fiddling about with on-disk data structures. So far I'm mostly just learning. I've read through the specs for SQLite's db file and some articles on CouchDB's copy-on-write B-tree (not to mention LevelDB's and Cassandra's LSM trees). I've also read Acunu's paper on Stratified B-Trees, which is all really interesting stuff. I'm not quite sure what I want to implement now, but I may start by trying to get a basic block and free-list allocator working. Just the experience of actually working with C++ and "real" algorithms is fascinating for me, a lowly PHP developer.
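To give a rough idea of what I mean by a block and free-list allocator, here is a minimal sketch. It is not something I've built yet. It assumes fixed-size blocks, uses an in-memory byte vector as a stand-in for the file, and threads the free list through the freed blocks themselves (each free block stores the number of the next free block in its first four bytes). The class name, block size and method names are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fixed-size block store with an intrusive free list.
// A std::vector stands in for the on-disk file; names are hypothetical.
class BlockStore {
public:
    static constexpr std::size_t kBlockSize = 4096;
    static constexpr std::uint32_t kNone = 0;  // block numbers start at 1

    // Reuse the most recently freed block if there is one,
    // otherwise grow the store by one zero-filled block.
    std::uint32_t allocate() {
        if (free_head_ != kNone) {
            const std::uint32_t id = free_head_;
            // The freed block's first bytes hold the next free block number.
            std::memcpy(&free_head_, block(id), sizeof free_head_);
            std::memset(block(id), 0, kBlockSize);
            return id;
        }
        data_.resize(data_.size() + kBlockSize, 0);
        return block_count();
    }

    // Push the block onto the free list. No double-free detection here.
    void release(std::uint32_t id) {
        std::memcpy(block(id), &free_head_, sizeof free_head_);
        free_head_ = id;
    }

    // Pointer to the start of a block. It is invalidated whenever
    // allocate() grows the store.
    std::uint8_t* block(std::uint32_t id) {
        assert(id != kNone && id <= block_count());
        return data_.data() + (static_cast<std::size_t>(id) - 1) * kBlockSize;
    }

    std::uint32_t block_count() const {
        return static_cast<std::uint32_t>(data_.size() / kBlockSize);
    }

private:
    std::vector<std::uint8_t> data_;
    std::uint32_t free_head_ = kNone;
};
```

Allocating three blocks, releasing two of them and allocating again hands back the released blocks (most recently freed first) before the store grows. A real on-disk version would also need to persist the free-list head in a header block and worry about crash safety, which this sketch ignores.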

In summary then, I'm still doing loads of geeky computer stuff, just forgetting to write about any of it.

Paul Banks
