Projects
An up-to-date list of all of my projects can be found on my GitHub page.
General
- mod_auth_openid: An authentication module for the Apache 2 webserver. It handles the functions of an OpenID consumer as specified in the OpenID 1.1 specification.
- Twistar: An asynchronous Python ORM library that is built on Twisted.
- bandit: A multi-armed bandit based alternative to A/B testing in Rails.
- ganapati: A Ruby interface to Hadoop’s HDFS
- HiveSwarm: Helpful user-defined functions / table-generating functions for Hive
- ankusa: A Naive Bayes classifier in Ruby that uses Hadoop’s HBase for storage
- hbaserb: A Ruby Thrift library for interacting with Hadoop’s HBase
- houdah: Ruby lib for interacting with a Hadoop JobTracker / TaskTrackers
- abanalyzer: An A/B test analysis library for Ruby - performs Chi-Square tests and G-tests on categorical data
- Blobber: Given a webcam, a projector, and a computer, the program tracks lights/colors with the camera and then projects “reactions” onto a large surface with the projector (see link for video)
- StactiveRecord: A C++ library designed to make simple database use simple. It was inspired by Ruby on Rails’ Active Record. It uses an object-relational mapping pattern to represent records as objects. It also provides basic persistent object relationships (one to many, many to many, one to one).
Academic Research Projects
All of my research so far has dealt with either text mining or network theory. My primary focus right now is multilabel hierarchical document classification (not listed below). These are all projects I maintain in the Bioinformatics Division at MUSC.
- pymur: A Python interface to The Lemur Toolkit.
- GOGrapher (manuscript in preparation): GOGrapher is a Python library that uses the Gene Ontology to create a network relating terms to each other and proteins to terms.
- GOSteiner: A project developing a general measure of protein-group functional coherence, based on the Gene Ontology and utilizing GOGrapher.