DNA Modifications



DNA modification analysis

Whole-genome DNA methylation detection is one of the most important parts of epigenetics research. DNA methylation is thought to have a strong effect on cancers and tumors, and may even be involved in human aging. It is also believed to be closely associated with diabetes and immunological diseases (Jeong et al., 2014; Hackett et al., 2013; Duthie, 2011).


DNA Methylation Prediction

T-BioInfo

Combines statistical analysis modules into pipelines to deal with heterogeneous big data. T-BioInfo can be used for: (1) next-generation sequencing (NGS) data (transcriptomics, genomics/epigenetics, and DNA/RNA); (2) mass-spectroscopy; (3) structural biology; and (4) data integration and modeling (virology, data association, and data mining).

Official Website

Publications:

Institution(s):

University of Haifa, Israel; Pine Biotech, Haifa, Israel


MethSurv

Correlates overall survival with DNA methylation levels. MethSurv lets users investigate methylation biomarkers associated with survival in various human cancers. It combines unsupervised hierarchical clustering and principal component analysis (PCA) for any particular gene. The tool gives a graphical overview of methylation differences between cancer patients as well as between gene subregions.

Official Website

Publications:

Institution(s):

Institute of Computer Science, University of Tartu, Tartu, Estonia; United Laboratories of Tartu University Hospital, Tartu University Hospital, Tartu, Estonia.


Nanopolish

Provides a nanopore consensus algorithm using a signal-level hidden Markov model (HMM). The main subprograms of Nanopolish are: (i) nanopolish extract, which extracts reads in FASTA or FASTQ format from a directory of FAST5 files; (ii) nanopolish eventalign, which aligns signal-level events to k-mers of a reference genome; (iii) nanopolish variants, which detects single nucleotide polymorphisms (SNPs) and indels with respect to a reference genome; and (iv) nanopolish variants --consensus, which calculates an improved consensus sequence for a draft genome assembly. Furthermore, Nanopolish contains an experimental option that uses event durations to improve the consensus accuracy around homopolymers.

Official Website

Publications:

Institution(s):

Ontario Institute for Cancer Research, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA



DNA Methylation Deconvolution

FaST-LMM-EWASher

An R version of FaST-LMM-EWASher, which performs epigenome-wide association analysis in the presence of confounders such as cell-type heterogeneity. A Python version of this software is also available as part of Fast-LMM-Py.

Official Website

Publications:

Institution(s):

eScience Research Group, Microsoft Research, Los Angeles, CA, USA; The Broad Institute of MIT and Harvard, Cambridge, MA, USA


ReFACTor | Reference-Free Adjustment for Cell-Type composition

A method for correcting cell-type heterogeneity in epigenome-wide association studies (EWAS). ReFACTor is based on a variant of principal component analysis (PCA) and can be applied to any tissue. It selects the sites that can be reconstructed with low error using a low-rank approximation of the original methylation matrix. Because ReFACTor does not use the phenotype in the selection process, it is also useful as part of a quality control step in EWAS.

Official Website

Github

Documentation

Publications:

Institution(s):

Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv, Israel; Department of Medicine, University of California, San Francisco, CA, USA
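
The site-selection idea described for ReFACTor can be illustrated with a short NumPy sketch. This is a simplified illustration under stated assumptions, not the ReFACTor implementation: the function names `refactor_like_components` and `demo_matrix`, the parameters `k` and `t`, the sites-by-samples layout and the synthetic data are all hypothetical. The final step, running PCA on the selected sites only, is also an assumption of this sketch.

```python
import numpy as np


def refactor_like_components(meth, k=2, t=20):
    """Select well-reconstructed sites and return sample-level components.

    meth: 2-D array, rows = CpG sites, columns = samples (assumed layout).
    k:    rank of the approximation and number of components returned.
    t:    number of sites to keep.
    """
    # Standardize each site across samples.
    centered = meth - meth.mean(axis=1, keepdims=True)
    sd = centered.std(axis=1, keepdims=True)
    z = centered / np.where(sd == 0, 1.0, sd)

    # Rank-k approximation of the standardized matrix via truncated SVD.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    approx = (u[:, :k] * s[:k]) @ vt[:k, :]

    # Per-site reconstruction error; keep the t best-reconstructed sites.
    error = np.linalg.norm(z - approx, axis=1)
    selected = np.argsort(error)[:t]

    # PCA on the selected sites only: the right singular vectors give
    # one score per sample for each component.
    _, _, vt_sel = np.linalg.svd(z[selected], full_matrices=False)
    return selected, vt_sel[:k].T


def demo_matrix(n_structured=20, n_noise=10, n_samples=40, seed=0):
    """Synthetic data: the first rows share two latent factors, the rest are noise."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(2, n_samples))
    angles = np.linspace(0.0, np.pi, n_structured, endpoint=False)
    loadings = np.column_stack([np.cos(angles), np.sin(angles)])
    structured = loadings @ factors
    noise = rng.normal(size=(n_noise, n_samples))
    return np.vstack([structured, noise])
```

Calling `refactor_like_components(demo_matrix())` returns the indices of the selected sites and a samples-by-`k` array. In an EWAS setting such components would typically be added as covariates, but that downstream use is outside this sketch.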


EDec | Epigenomic Deconvolution

Provides accurate, platform-independent estimation of cell type proportions, DNA methylation profiles and gene expression profiles of constituent cell types. EDec enables deconvolution of complex tumor tissues where highly accurate references are not available. It reveals layers of biological information about distinct cell types within solid tumors and about their heterotypic interactions that were previously inaccessible at such a large scale due to tissue heterogeneity.

Official Website

Github

Documentation

Publications:

Institution(s):

Molecular and Human Genetics Department, Baylor College of Medicine, Houston, TX, USA



DNA methylation array analysis

DNA methylation is involved in numerous physiological processes and in disease states such as cancer (Jones, 2012). This has raised wide interest in developing large-scale DNA methylation profiling technologies to improve our molecular understanding of diseases. The recently released Infinium HumanMethylation450 (Bibikova et al., 2011; Dedeurwaerder et al., 2011) is a preferred technology for studying the DNA methylomes of various cell types in large-scale studies, and there is a current explosion of data generated with this technology (Rakyan et al., 2011). Sequencing-based methods, although offering much higher genome coverage, are still not affordable for all laboratories, notably those with moderate budgets. Another reason for the success of DNA methylation arrays is the ease of reading and understanding the data generated, notably because microarrays have been widely used over the past decades, particularly for gene expression profiling.

Differential Methylation Site Detection

limma | Linear Models for Microarray Data

Provides an integrated solution for analysing data from gene expression experiments. limma contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. It also contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions: (i) it can perform both differential expression and differential splicing analyses of RNA-seq data; (ii) the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences.

Official Website

Documentation

Publications:

Institution(s):

Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia


FastDMA

Software for analyzing Illumina Infinium HumanMethylation450 BeadChip data, featuring multi-core parallel computing.

Official Website

Publications:

Institution(s):

Bioinformatics Division/Center for Synthetic and Systems Biology, Tsinghua National Laboratory for Information Science and Technology (TNLIST), Department of Automation, Tsinghua University, Beijing, China


RnBeads

An R package for comprehensive analysis of DNA methylation data obtained with any experimental protocol that provides single-CpG resolution, including Infinium 450K microarray and bisulfite sequencing protocols, but also MeDIP-seq and MBD-seq.

Official Website

Galaxy

Publications:

Institution(s):

Max Planck Institute for Informatics, Saarbrücken, Germany



Differential Methylation Region Detection

ChAMP

Allows Illumina HumanMethylation BeadChip analysis. ChAMP is an integrated analysis pipeline including functions for (i) filtering low quality probes, adjustment for Infinium I and Infinium II probe design, (ii) batch effect correction, detecting differentially methylated positions (DMPs), (iii) finding differentially methylated regions (DMRs) and (iv) detection of copy number aberrations. The software also allows detection of differentially methylated genomic blocks (DMB) and Gene Set Enrichment Analysis (GSEA).

Official Website

Documentation

Publications:

Institution(s):

CAS Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, China


FastDMA

Software for analyzing Illumina Infinium HumanMethylation450 BeadChip data, featuring multi-core parallel computing.

Official Website

Publications:

Institution(s):

Bioinformatics Division/Center for Synthetic and Systems Biology, Tsinghua National Laboratory for Information Science and Technology (TNLIST), Department of Automation, Tsinghua University, Beijing, China


RnBeads

An R package for comprehensive analysis of DNA methylation data obtained with any experimental protocol that provides single-CpG resolution, including Infinium 450K microarray and bisulfite sequencing protocols, but also MeDIP-seq and MBD-seq.

Official Website

Galaxy

Publications:

Institution(s):

Max Planck Institute for Informatics, Saarbrücken, Germany



Enrichment Analysis

Over-representation analysis

Blast2GO

Permits functional annotation, management, and data mining of novel sequence data. Blast2GO is based on a common controlled vocabulary schema, the Gene Ontology (GO). It takes into consideration similarity, the extension of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. This tool is suitable for plant genomics research. It generates functional annotations and helps researchers assess the functional meaning of their experimental results.

Official Website

Galaxy

Documentation

Publications:

Institution(s):

Bioinformatics Department, Centro de Investigación Príncipe Felipe, Valencia, Spain


g:Profiler

Provides tools to perform functional enrichment analysis and mine additional information. g:Profiler is a web server for characterizing and manipulating gene lists from high-throughput genomics. It analyses flat or ranked gene lists for enriched features, converts gene identifiers of different classes, maps genes to orthologous genes in related species, finds similarly expressed genes from public microarray data and maps human single nucleotide polymorphisms (SNPs) to gene names, chromosomal locations and variant consequence terms from the Sequence Ontology (SO).

Official Website

Documentation

Publications:

Institution(s):

Ontario Institute for Cancer Research, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Institute of Computer Science, University of Tartu, Tartu, Estonia


STEM | Short Time-series Expression Miner

A software program specifically designed for the analysis of short time series microarray gene expression data. STEM implements unique methods to cluster, compare, and visualize such data. STEM also supports efficient and statistically rigorous biological interpretations of short time series data through its integration with the Gene Ontology.

Official Website

Publications:

Institution(s):

Center for Automated Learning and Discovery, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA



Gene set enrichment analysis

GSEA | Gene Set Enrichment Analysis

Evaluates microarray data at the level of gene sets. GSEA aims to determine whether members of a gene set S tend to occur toward the top (or bottom) of a ranked gene list L, in which case the gene set is correlated with the phenotypic class distinction. This method eases the interpretation of a large-scale experiment by identifying pathways and processes, and can boost the signal-to-noise ratio when the members of a gene set exhibit strong cross-correlation, making it possible to detect modest changes in individual genes.

Official Website

Forum

Publications:

Institution(s):

Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA; Department of Systems Biology, Harvard Medical School, Boston, MA, USA
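
The question GSEA asks, whether members of a set S cluster toward the top or bottom of a ranked list L, is usually answered with a running-sum statistic. The following is a minimal sketch of such a statistic, not the GSEA software itself: the function name `enrichment_score` and its parameters are hypothetical, and the permutation step that a full analysis would use to assess significance is omitted.

```python
def enrichment_score(ranked_genes, gene_set, weights=None):
    """Running-sum enrichment score of gene_set along ranked_genes.

    ranked_genes: gene identifiers ordered from most to least associated
                  with the phenotype (the list L).
    gene_set:     collection of gene identifiers (the set S).
    weights:      optional non-negative weight per ranked gene; when omitted
                  every hit counts equally (a Kolmogorov-Smirnov-like walk).
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    if weights is None:
        weights = [1.0] * len(ranked_genes)
    hit_total = sum(w for w, h in zip(weights, hits) if h)
    if hit_total <= 0:
        raise ValueError("weights of gene-set members must sum to a positive value")

    running = 0.0
    extreme = 0.0
    for w, h in zip(weights, hits):
        # Step up on a gene-set member, step down otherwise.
        running += w / hit_total if h else -1.0 / n_miss
        # Track the largest deviation from zero, keeping its sign.
        if abs(running) > abs(extreme):
            extreme = running
    return extreme
```

A score near +1 means the set is concentrated at the top of the list, and a score near -1 means it is concentrated at the bottom. Passing per-gene correlation magnitudes as `weights` gives a weighted walk in which strongly associated members contribute larger steps.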


DESeq | Differential expression : HTS analysis

Performs differential gene expression analysis. DESeq integrates methodological advances with features that facilitate quantitative analysis of comparative RNA-seq data, using shrinkage estimators for dispersion and fold change. The software is suitable for small studies with few replicates as well as for large observational studies. Its heuristics for outlier detection help recognize genes for which the modeling assumptions are unsuitable, and so avoid the type-I errors these would cause.

Official Website

Documentation

Publications:

Institution(s):

Department of Biostatistics and Computational Biology, Dana Farber Cancer Institute, Boston, MA, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA


edgeR | empirical analysis of DGE in R

Allows differential expression analysis of digital gene expression data. edgeR implements a range of statistical methodology based on the negative binomial distribution, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. The package and methods are general, and can work on other sources of count data, such as barcoding experiments and peptide counts.

Official Website

Galaxy

Documentation

Publications:

Institution(s):

Cancer Program, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia; Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia



Topology enrichment analysis

DAVID | Database for Annotation, Visualization and Integrated Discovery

Allows users to obtain biological features/meaning associated with large gene or protein lists. DAVID can determine gene-gene similarity, based on the assumption that genes sharing global functional annotation profiles are functionally related to each other. It groups related genes or terms into functional groups using a similarity distance measure. This tool takes into account the redundant and network nature of biological annotation contents.

Official Website

Publications:

Institution(s):

Laboratory of Immunopathogenesis and Bioinformatics, Frederick, MD, USA; Advanced Biomedical Computing Center, Frederick, MD, USA


HOMER | Hypergeometric Optimization of Motif EnRichment

Performs peak finding and downstream data analysis for next-generation sequencing. HOMER provides several tools and methods for working with ChIP-Seq, GRO-Seq, RNA-Seq, DNase-Seq, Hi-C and other types of functional genomics sequencing data sets. The software supports UCSC visualization, peak annotation, and quantification of transcripts and repeats or of differential features, enrichment and expression.

Official Website

Galaxy

Publications:

Institution(s):

Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Department of Medicine, University of California, San Diego, La Jolla, CA, USA


GeneCodis | Topology enrichment analysis

A web-based tool for the ontological analysis of large lists of genes. It can be used to determine biological annotations or combinations of annotations that are significantly associated with a list of genes under study with respect to a reference list. As well as single annotations, this tool allows users to simultaneously evaluate annotations from different sources, for example the Biological Process and Cellular Component categories of the Gene Ontology.

Official Website

Publications:

Institution(s):

Functional Bioinformatics Group, National Center for Biotechnology (CNB-CSIC), Madrid, Spain
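
Testing whether an annotation is significantly associated with a gene list relative to a reference list is commonly done with a hypergeometric test. The sketch below shows that calculation in plain Python as a general illustration. This page does not state which test any particular tool applies, and the function name `hypergeom_enrichment_p` and its parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_reference, n_annotated, n_list, n_overlap):
    """One-sided p-value for over-representation of an annotation.

    n_reference: number of genes in the reference list.
    n_annotated: reference genes carrying the annotation.
    n_list:      number of genes in the list under study.
    n_overlap:   genes in the list under study carrying the annotation.

    Returns P(X >= n_overlap) for X drawn from the hypergeometric
    distribution defined by the first three arguments.
    """
    if not (0 <= n_annotated <= n_reference and 0 <= n_list <= n_reference):
        raise ValueError("counts must fit inside the reference list")
    upper = min(n_annotated, n_list)
    if not (0 <= n_overlap <= upper):
        raise ValueError("overlap cannot exceed either gene count")

    total = comb(n_reference, n_list)
    tail = sum(
        comb(n_annotated, i) * comb(n_reference - n_annotated, n_list - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total
```

In practice one p-value is computed per annotation term, so the results need a multiple-testing correction before terms are reported as enriched.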



Bisulfite Sequencing Data Analysis (BS-seq analysis)

DNA methylation contributes to the epigenetic regulation of many key developmental processes including genomic imprinting, X-inactivation, genome stability and gene regulation. Bisulfite conversion of genomic DNA combined with next-generation sequencing (BS-seq) is widely used to measure the methylation state of a whole genome, the methylome, at single-base resolution (Lister et al., 2009; Bock et al., 2010; Harris et al., 2010).

Differential Methylation Region Detection

methylKit | Methylation annotation : BS-seq analysis

An R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. methylKit is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods such as Agilent SureSelect methyl-seq. In addition, methylKit can deal with base-pair resolution data for 5hmC obtained from TAB-seq or oxBS-seq. It can also handle whole-genome bisulfite sequencing data if the proper input format is provided.

Official Website

Publications:

Institution(s):

Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, USA


SMART | Specific Methylation Analysis and Report Tool

Detects cell type-specific methylation marks by integrating multiple methylomes from human cell lines and tissues. SMART is an entropy-based framework that dynamically integrates a large number of DNA methylomes for the de novo identification of cell type-specific MethyMarks.

Official Website

Issue

Publications:

Institution(s):

College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Department of Rehabilitation, the First Affiliated Hospital of Harbin Medical University, Harbin, China
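
To give a feel for what "entropy-based" can mean for cell type specificity, the sketch below scores one CpG or region by the Shannon entropy of its methylation levels across cell types. This is a generic illustration and not SMART's published scoring: the function name `methylation_specificity` and the normalization are assumptions. As written it only captures signal confined to a few cell types, so a site that is methylated everywhere except one cell type would score low.

```python
from math import log2


def methylation_specificity(levels):
    """Return a 0..1 specificity score from methylation levels across cell types.

    levels: methylation level (0..1) of one CpG or region in each cell type.
    1.0 means the methylation signal is confined to a single cell type;
    0.0 means it is spread evenly, or that there is no signal at all.
    """
    n = len(levels)
    total = sum(levels)
    if n < 2 or total <= 0:
        return 0.0
    # Treat the levels as a probability distribution over cell types.
    probs = [x / total for x in levels]
    entropy = -sum(p * log2(p) for p in probs if p > 0)
    # Normalize by the maximum possible entropy, log2(n).
    return 1.0 - entropy / log2(n)
```

For example, `methylation_specificity([1.0, 0.0, 0.0, 0.0])` gives the maximum score, while four equal levels give the minimum.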


DSS-single | Differentially methylated region detection : BS-seq analysis

A package based on a statistical method for detecting DMRs from WGBS (Whole Genome Bisulfite Sequencing) data without replicates. A key feature of DSS-single is to estimate biological variation when replicated data are not available. The method takes advantage of the spatial correlation of methylation levels: since the methylation levels from nearby CpG sites are similar, we can use nearby CpG sites as ‘pseudo-replicates’ to estimate dispersion. Simulations demonstrate that DSS-single has greater sensitivity and accuracy than existing methods, and an analysis of H1 versus IMR90 cell lines suggests that it also yields the most biologically meaningful results.

Official Website

Documentation

Publications:

Institution(s):

Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA



Methylation Annotation

GBSA | Genome Bisulfite Sequencing Analyser

An open-source software tool capable of analysing whole-genome bisulfite sequencing data with either a gene-centric or gene-independent focus. GBSA’s output can be easily integrated with other high-throughput sequencing data, such as RNA-Seq or ChIP-seq, to elucidate the role of methylated intergenic regions in gene regulation. In essence, GBSA allows an investigator to explore not only known loci but also all the genomic regions, for which methylation studies could lead to the discovery of new regulatory mechanisms.

Official Website

Publications:

Institution(s):

Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore; Department of Pathology, National University of Singapore, Singapore, Singapore


methylKit | Methylation annotation : BS-seq analysis

An R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. methylKit is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods such as Agilent SureSelect methyl-seq. In addition, methylKit can deal with base-pair resolution data for 5hmC obtained from TAB-seq or oxBS-seq. It can also handle whole-genome bisulfite sequencing data if the proper input format is provided.

Official Website

Publications:

Institution(s):

Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, USA



EWAS

ReFACTor | Reference-Free Adjustment for Cell-Type composition

A method for correcting cell-type heterogeneity in epigenome-wide association studies (EWAS). ReFACTor is based on a variant of principal component analysis (PCA) and can be applied to any tissue. It selects the sites that can be reconstructed with low error using a low-rank approximation of the original methylation matrix. Because ReFACTor does not use the phenotype in the selection process, it is also useful as part of a quality control step in EWAS.

Official Website

Github

Documentation

Publications:

Institution(s):

Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv, Israel; Department of Medicine, University of California, San Francisco, CA, USA


FaST-LMM-EWASher

An R version of FaST-LMM-EWASher, which performs epigenome-wide association analysis in the presence of confounders such as cell-type heterogeneity. A Python version of this software is also available as part of Fast-LMM-Py.

Official Website

Publications:

Institution(s):

eScience Research Group, Microsoft Research, Los Angeles, CA, USA; The Broad Institute of MIT and Harvard, Cambridge, MA, USA


Find more tools: OMICTOOLS
