NEW: The ChIPseeqer paper is out and highlighted as

Analysis of ChIP-seq data

ChIPseeqer: a comprehensive framework for the analysis of ChIP-seq data.

Pathway Analysis

iPAGE: Information-Theoretic Pathway Analysis of Gene Expression (with H. Goodarzi and Saeed Tavazoie).

Motif discovery

FIRE: a universal framework for DNA and RNA motif discovery from expression data.

Fastcompare: whole-genome, alignment-free identification of conserved DNA and RNA motifs.


FIRE-pro: protein motif discovery using mutual information (with D. Lieber and S. Tavazoie).


DTscore / DTdraw: tandem gene duplication tree reconstruction and display.

IMGT/PhyloGene: online creation of phylogenetic trees for immunoglobulin (IG) and T-cell receptor (TR) genes.

More Software

  • C implementation of Iclust (Slonim et al., 2005, PNAS): Iclust.tar.gz (code provided by Noam Slonim)
  • A C program that calculates the mutual information, and its statistical significance, between columns of a motif alignment (a collection of sites in AlignACE format).
  • A reimplementation in C of CompareACE (for comparing weight matrices)
  • A reimplementation of ScanACE (for finding motif weight matrix matches), plus a script to sample sites from weight matrices
  • An email-based web page monitoring script (I got very tired of checking the same rarely-changing pages over and over)
  • Kohonen / Self-Organizing Map (SOM) clustering program (in C)
  • A C implementation of shortest paths in a graph (nominally Dijkstra's, but in fact the Bellman-Ford algorithm)
  • A very simple and flexible backup script based on rsync
  • A script for drawing motif maps (like this one)
  • A Perl module (interfacing some C code) for calculating hypergeometric p-values: Hypergeom-0.01.tar
  • A simple Perl implementation of Cancer Outlier Profile Analysis (COPA), described in Tomlins et al., 2005: COPA.
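
For readers unfamiliar with the hypergeometric p-values mentioned in the list above, here is a minimal Python sketch of the usual upper-tail calculation. It is not the code from the Hypergeom module. The function name `hypergeom_upper_tail` and its argument order are hypothetical, and the choice of the upper tail (the form commonly used for enrichment tests) is an assumption, since the list does not say which tail the module reports.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    Hypothetical helper, not the Hypergeom module's API.
      N: population size (e.g. all genes)
      K: number of "successes" in the population (e.g. genes in a pathway)
      n: number of draws without replacement (e.g. genes in a cluster)
      k: observed number of successes among the draws
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")

    # X can only take values in [max(0, n - (N - K)), min(n, K)].
    lo = max(k, 0, n - (N - K))
    hi = min(n, K)

    # Exact integer arithmetic; a single division at the end.
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return favourable / total


if __name__ == "__main__":
    # 10 genes, 4 in the pathway, 3 drawn, at least 2 of them in the pathway.
    print(hypergeom_upper_tail(2, 3, 4, 10))
```

Exact binomial coefficients keep the sketch simple and avoid rounding in the sum. For very large populations, a log-space implementation is the more common choice, and that may be why the original module delegates to C.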