Code for several recent projects is available below. If you want to explore all the software we have developed, please see our lab's GitHub page here.
Evaluating Dimensionality Reduction
Our lab has developed a novel approach to quantifying how much distortion dimensionality reduction algorithms introduce into data. This method, called the “Average Jaccard Distance” (AJD), is described in our recent paper. You can find code for applying the AJD to your own dimensionality reduction problems here.
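To give a rough sense of what a Jaccard-based distortion measure can look like, here is a minimal sketch. It assumes the metric compares each point's k nearest neighbors before and after dimensionality reduction and averages the Jaccard distance between the two neighbor sets; that reading, the function names, and the parameter `k` are illustrative assumptions rather than the lab's published code, so please refer to the paper and the linked repository for the actual definition.

```python
import math

def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors

def average_jaccard_distance(original, reduced, k):
    """Mean Jaccard distance between neighbor sets in the original and reduced spaces.

    Assumption: this neighbor-set formulation is an illustrative sketch,
    not necessarily the exact definition used in the paper.
    """
    before = knn_sets(original, k)
    after = knn_sets(reduced, k)
    distances = [1 - len(a & b) / len(a | b) for a, b in zip(before, after)]
    return sum(distances) / len(distances)

# Toy example: two tight pairs of points in 2D, and a 1D embedding that scrambles them.
original = [(0, 0), (1, 0), (10, 0), (11, 0)]
scrambled = [(0,), (10,), (1,), (11,)]
ajd_identity = average_jaccard_distance(original, original, k=1)
ajd_scrambled = average_jaccard_distance(original, scrambled, k=1)
```

Under this sketch, a value of 0 means every neighborhood is preserved and a value of 1 means no neighbors are shared. The brute-force neighbor search is quadratic in the number of points, so it is only suitable for small examples.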
Epsilon Network Analysis
Since dimensionality reduction tools can drastically distort high-dimensional data, it can be helpful to analyze the data directly in the original high-dimensional space. We recently developed code for applying a simple graph-theoretic analysis, which we term “epsilon networks,” to high-dimensional data. Applying this approach to single-cell genomics data, including single-cell RNA-sequencing (scRNA-seq) data, demonstrated that these data do not generally contain natural, distinct clusters that might correspond to different cell types. You can read about the method and our results in our recent paper. Code for performing this analysis on any dataset is available here.
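As an illustration of the general idea, the sketch below builds a graph in which two points are connected whenever they lie within a distance epsilon of each other, and then counts connected components as epsilon varies. This construction, the use of Euclidean distance, and the function names are assumptions made for illustration; the paper and the linked code define the analysis actually used.

```python
import math

def epsilon_network(points, epsilon):
    """Adjacency sets for a graph linking points at Euclidean distance <= epsilon.

    Assumption: a hypothetical, brute-force construction for illustration only.
    """
    adjacency = {i: set() for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= epsilon:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

def connected_components(adjacency):
    """Return the connected components of the graph as a list of index sets."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Toy data with two well-separated groups of points.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
component_counts = {
    eps: len(connected_components(epsilon_network(points, eps)))
    for eps in (0.5, 1.5, 20.0)
}
```

In this toy example the two groups appear as separate components over a wide range of epsilon values. Data without distinct clusters would be expected to move from many isolated points to a single large component without such a stable intermediate regime.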
Differentially Distributed Genes
We have developed a novel algorithm for feature selection in droplet-based scRNA-seq data. This algorithm uses a simple statistical test to identify genes whose variation in the data is unlikely to arise purely from technical noise; we term these Differentially Distributed Genes (DDGs). The code for identifying DDGs can be found here.
Modeling the Assembly of Stacked Rings
In addition to developing methods for data analysis, our lab builds mathematical modeling frameworks for studying the self-assembly of macromolecular machines. We recently developed an approach for studying the assembly of stacked rings, which is described in this paper. You can find the code for performing these simulations and all the related analyses from that work here.