Software and data packages

Bioconductor

Software packages

  • R/spoon | R/Bioconductor package that addresses the mean-variance relationship in spatially resolved transcriptomics data.
  • R/SpotSweeper | R/Bioconductor package that provides spatially aware quality control metrics for spatial transcriptomics data. (Totty et al., 2024. bioRxiv).
  • R/escheR | R/Bioconductor package built on ggplot2 and the Gestalt principles to visualize multi-dimensional data in 2D space (e.g. embeddings or spatial visualizations). (Guo et al., 2023. Bioinformatics Advances).
  • R/TREG | R/Bioconductor package to find Total RNA Expression Genes (TREGs) in single-nucleus RNA-seq data. (Huuki-Myers et al., 2023. Genome Biology).
  • R/nnSVG | R/Bioconductor package that provides a scalable method to identify spatially variable genes (SVGs) in spatially resolved transcriptomics data based on nearest-neighbor Gaussian processes. (Weber et al., 2023. Nature Communications).
  • R/miQC | R/Bioconductor package with a data-driven quality control metric to predict low-quality cells in scRNA-seq data. (Hippen et al., 2021. PLOS Computational Biology).
  • R/SpatialExperiment | R/Bioconductor package to define an S4 class for data with spatial coordinates by extending SingleCellExperiment to support spatial transcriptomics technologies such as seqFISH and 10x Visium. (Righelli, Weber, Crowell et al., 2021. Bioinformatics).
  • R/bluster | R/Bioconductor package to wrap common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.
  • R/spQN | R/Bioconductor package to implement spatial quantile normalization (SpQN). This method was developed to remove a mean-correlation relationship in correlation matrices built from gene expression data. (Wang et al., 2020. PLOS Computational Biology).
  • R/scry | R/Bioconductor package to implement count-based feature selection and dimension reduction algorithms to facilitate unsupervised analysis of high-dimensional data, such as single-cell RNA-seq. This package builds on the glmpca CRAN package. (Townes et al., 2019. Genome Biology).
  • R/methylCC | R/Bioconductor package to estimate the cell composition of whole blood in DNA methylation samples. (Hicks and Irizarry, 2019. Genome Biology).
  • R/mbkmeans | R/Bioconductor package implementing the mini-batch optimization for k-means (mbkmeans) clustering proposed in Sculley (2010) for large datasets, including scRNA-seq data. The mini-batch k-means algorithm can be run with data stored in-memory or on-disk (e.g. HDF5 file format). (Hicks et al., 2021. PLOS Computational Biology).
  • R/TreeSummarizedExperiment | R/Bioconductor package to define an S4 class for data with hierarchical structure by extending SingleCellExperiment to include hierarchical information on the rows or columns of the rectangular data. (Huang et al., 2020. F1000Research).
  • R/qsmooth | R/Bioconductor package that implements a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions. (Hicks et al., 2018. Biostatistics).
  • R/quantro | R/Bioconductor package to test for global differences between groups of distributions to decide when to use quantile normalization. (Hicks and Irizarry, 2015. Genome Biology).

Data packages

CRAN

Software packages

  • R/fasthplus | Provides fast approximations for metrics of discordance or dissimilarity, which can be used (1) to evaluate the discordance between two arbitrary sets or (2) to evaluate label fitness (clustering) for a generalized dissimilarity matrix. (Dyjack et al., 2022. Biostatistics).
  • R/OCSdata | Provides functions to access and download data from the Open Case Studies repositories on GitHub. Different functions enable users to grab the data they need at different sections in the case study, as well as download the whole case study repository.
  • R/glmpca | Implements generalized principal component analysis (GLM-PCA) for dimension reduction of non-normally distributed data such as counts or binary matrices. (Townes et al., 2019. Genome Biology).

Python

Software packages

GitHub

Software packages

  • R/quantroSim | Supporting R package for quantro that simulates gene expression and DNA methylation data.
  • R/explainr | Translates S3 objects into text using standard templates in a simple and convenient way.
  • postMUT | A tool implemented in Perl and R to predict the functionality of missense mutations.

Data packages

  • trapnell2014myoblasthuman | R data package that contains an ExpressionSet object from Trapnell et al. (2014), who performed a time-series experiment of bulk and single-cell RNA-Seq at four time points in differentiated primary human myoblasts.
  • patel2014gliohuman | R data package that contains a SummarizedExperiment object from Patel et al. (2014) with single-cell and bulk RNA-Seq data on five human glioblastoma tumors.
  • colonCancerWGBS | Cov files produced by Bismark after mapping six paired tumor-normal WGBS samples from Ziller et al. (2013) PMID: 23925113. Only chr22 is included.
  • myAffyData | AffyBatch object from an experiment using P493-6 cells expressing low or high levels of c-Myc. Data from Loven et al. (2012) Cell 151: 476-482.
  • BackgroundExperimentYeast | AffyBatch object from an experiment to measure NSB and optical noise in yeast.