ChipEnrich

ChIP-Enrich: Gene set enrichment testing for ChIP-seq data
Overview
ChIP-Enrich tests ChIP-seq peak data for enrichment of biological pathways, Gene Ontology terms, and other types of gene sets. The pre-defined gene sets are the same as those used in LRpath, and can be browsed here. Using an input .BED file, ChIP-Enrich assigns peaks to genes based on a chosen "locus definition". The "locus" of a gene is the region from which the gene is predicted to be regulated. ChIP-Enrich uses a logistic regression model to test for association between the presence of at least one peak in a gene and gene set membership. It empirically adjusts for the relationship between the length of the loci (and optionally mappability) and the presence of at least one peak in a gene using a binomial cubic smoothing spline term within the logistic model. Detailed methods are provided here. Output includes summary plots, peak-to-gene assignments, and enrichment (and depletion) results including odds ratio, p-value, and FDR for each gene set.
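The idea of the test can be illustrated with a small sketch. This is not the ChIP-Enrich implementation: it swaps the binomial cubic smoothing spline for a plain linear term in log10 locus length, fits the model with a hand-written Newton-Raphson loop, and uses made-up toy genes. The function names and the data are assumptions for illustration only.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_logistic(rows, y, iterations=25):
    """Newton-Raphson fit; each row is a covariate list with a leading 1."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# Hypothetical toy genes: (has_peak, in_gene_set, locus_length_bp)
genes = (
    [(1, 1, 1e3)] * 1 + [(0, 1, 1e3)] * 2 + [(1, 0, 1e3)] * 1 + [(0, 0, 1e3)] * 6
    + [(1, 1, 1e4)] * 2 + [(0, 1, 1e4)] * 1 + [(1, 0, 1e4)] * 3 + [(0, 0, 1e4)] * 4
    + [(1, 1, 1e5)] * 3 + [(0, 1, 1e5)] * 1 + [(1, 0, 1e5)] * 5 + [(0, 0, 1e5)] * 2
)

# Covariates: intercept, gene set membership, log10 locus length
# (linear length term is an assumption standing in for the spline).
rows = [[1.0, float(in_set), math.log10(length)] for _, in_set, length in genes]
y = [peak for peak, _, _ in genes]

intercept, beta_set, beta_length = fit_logistic(rows, y)
odds_ratio = math.exp(beta_set)  # > 1 suggests enrichment, < 1 depletion
print(f"gene set odds ratio, adjusted for locus length: {odds_ratio:.2f}")
```

The coefficient on gene set membership is the quantity of interest; the length term is there only so that long loci, which are more likely to contain a peak by chance, do not drive the result.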
Broad-Enrich: If your data set consists of broad genomic regions or covers a significant portion of the genome, we recommend using Broad-Enrich instead.


ENCODE enrichment testing results can be accessed here

Select input file

ChIP-Enrich supports .bed.gz, .bed, .broadPeak, .narrowPeak, .bed.gff, and .bed.gff3 files. Files with any other extension should have a header row including "chrom", "start", and "end" to indicate chromosomal locations.
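As a rough illustration of the header rule, the sketch below checks whether a file name carries one of the listed extensions and, if not, locates the "chrom", "start", and "end" columns by name. The function names are hypothetical, and whitespace-delimited columns are an assumption; this is not the tool's own parser.

```python
# Extensions listed above as directly supported.
KNOWN_EXTENSIONS = (".bed.gz", ".bed", ".broadPeak", ".narrowPeak", ".bed.gff", ".bed.gff3")
REQUIRED_COLUMNS = ("chrom", "start", "end")

def needs_header(filename):
    """True when the file must carry a chrom/start/end header row."""
    return not filename.endswith(KNOWN_EXTENSIONS)

def parse_headered(lines):
    """Parse whitespace-delimited lines whose first row names the columns."""
    rows = [line.split() for line in lines if line.strip()]
    header = rows[0]
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"header row is missing: {', '.join(missing)}")
    c, s, e = (header.index(name) for name in REQUIRED_COLUMNS)
    return [(r[c], int(r[s]), int(r[e])) for r in rows[1:]]

example = ["name\tchrom\tstart\tend", "peak1\tchr1\t100\t200"]
peaks = parse_headered(example)
```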

Analysis Name
Please provide a meaningful name for this analysis (used to name output files).

Email
Please provide your email address if you wish to be notified when the analysis has been completed.

Supported Genomes



Annotation Databases: Selecting multiple annotation databases, or a single large one, may require several minutes of computation time. For approximate ChIP-Enrich running times against different databases, view this table.

Filter: Only test gene sets with fewer than the following number of genes:
The filter value should be numeric and greater than 30. It can be used to remove large, vague gene sets such as "binding".
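The size filter amounts to dropping gene sets at or above the chosen size. A minimal sketch, assuming gene sets are held as a dictionary from set name to a collection of gene IDs (the function name and data layout are hypothetical):

```python
def filter_gene_sets(gene_sets, max_size):
    """Keep only gene sets with fewer than max_size genes."""
    if max_size <= 30:
        raise ValueError("filter value should be greater than 30")
    return {name: genes for name, genes in gene_sets.items() if len(genes) < max_size}

# Toy input: one very large, vague set and one small set.
toy_sets = {"binding": set(range(5000)), "small pathway": set(range(40))}
kept = filter_gene_sets(toy_sets, 2000)
```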

Peak Threshold Number
Minimum number of peaks that must be assigned to a gene before it is coded as 1 (having a peak) in the test. Typically, this should be set to 1.
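The coding step can be sketched as a count followed by a comparison against the threshold. The function name and inputs below are hypothetical; the sketch assumes one gene ID per assigned peak.

```python
from collections import Counter

def code_genes(assigned_genes, all_genes, threshold=1):
    """Return 1 for genes with at least `threshold` assigned peaks, else 0."""
    counts = Counter(assigned_genes)
    return {gene: int(counts[gene] >= threshold) for gene in all_genes}

# One entry per peak-to-gene assignment (toy data).
assigned = ["A", "A", "B"]
coded_default = code_genes(assigned, ["A", "B", "C"])
coded_strict = code_genes(assigned, ["A", "B", "C"], threshold=2)
```

With the default threshold of 1, genes A and B are coded as having a peak; raising the threshold to 2 leaves only A.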

Enrichment Method
  • ChIP-Enrich
  • Fisher's exact test
We recommend using Fisher's exact test only with the 1kb or 5kb locus definition. Using it with any of the other locus definitions may result in biased enrichment results.
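For reference, Fisher's exact test on a 2x2 table of gene set membership against peak presence can be sketched with the hypergeometric distribution. The version below computes a one-sided (enrichment) p-value; whether the tool reports a one-sided or two-sided test is not stated here, so treat that choice, and the function name, as assumptions.

```python
from math import comb

def fisher_enrichment_p(set_peak, set_no_peak, other_peak, other_no_peak):
    """One-sided Fisher's exact p-value for enrichment of peaks in a gene set."""
    in_set = set_peak + set_no_peak
    with_peak = set_peak + other_peak
    total = in_set + other_peak + other_no_peak
    upper = min(in_set, with_peak)
    # Probability of seeing set_peak or more in-set genes with a peak,
    # given fixed row and column totals.
    tail = sum(
        comb(in_set, k) * comb(total - in_set, with_peak - k)
        for k in range(set_peak, upper + 1)
    )
    return tail / comb(total, with_peak)

# Toy table: 3 of 4 in-set genes have a peak, 1 of 4 other genes has a peak.
p_value = fisher_enrichment_p(3, 1, 1, 3)
```

Unlike the logistic model, this test has no term for locus length, which is consistent with the recommendation to restrict it to the 1kb and 5kb locus definitions.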
Locus Definition
  • < 1kb
    (only use peaks within 1kb of a transcription start site)
  • < 5kb
    (only use peaks within 5kb of a transcription start site)
  • < 10kb
    (only use peaks within 10kb of a transcription start site)
  • > 10kb and more upstream
    (only use peaks located more than 10kb upstream of a transcription start site)
  • Exon
    (only use peaks that fall within an annotated exon)
  • Intron
    (only use peaks that fall within an annotated intron)
  • Nearest Gene
    (use all peaks; assign peaks to the nearest gene defined by transcription start and end sites)
  • Nearest TSS
    (use all peaks; assign peaks to the gene with the closest TSS)
  • User Defined
    (user can input their own locus definition)
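To make the TSS-based definitions concrete, the sketch below assigns each peak to the gene with the closest TSS and optionally discards peaks farther away than a cutoff, which approximates the "< 1kb" style definitions. It is an illustration under stated assumptions, not the tool's code: it uses the peak midpoint as the peak position, ignores strand, and the function name and the TSS table layout are hypothetical.

```python
import bisect

def assign_to_nearest_tss(peaks, tss_by_chrom, max_distance=None):
    """Assign peaks to the gene with the closest TSS on the same chromosome.

    peaks: list of (chrom, start, end)
    tss_by_chrom: {chrom: [(position, gene), ...]} sorted by position
    max_distance: if given, drop peaks farther than this from the closest TSS
    """
    assignments = []
    for chrom, start, end in peaks:
        sites = tss_by_chrom.get(chrom, [])
        if not sites:
            continue  # no annotated TSS on this chromosome
        mid = (start + end) // 2  # assumption: a peak is represented by its midpoint
        positions = [pos for pos, _ in sites]
        i = bisect.bisect_left(positions, mid)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = sites[max(i - 1, 0):i + 1]
        pos, gene = min(candidates, key=lambda site: abs(site[0] - mid))
        if max_distance is None or abs(pos - mid) <= max_distance:
            assignments.append(((chrom, start, end), gene))
    return assignments

toy_tss = {"chr1": [(1000, "A"), (5000, "B")]}
toy_peaks = [("chr1", 1100, 1300), ("chr1", 3800, 4000), ("chr2", 10, 20)]

nearest = assign_to_nearest_tss(toy_peaks, toy_tss)
within_1kb = assign_to_nearest_tss(toy_peaks, toy_tss, max_distance=1000)
```

With no cutoff, every peak on an annotated chromosome is used; with a 1kb cutoff, the second toy peak (1,100 bp from the TSS of B) is discarded.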
Adjust for the mappability of the gene locus regions
  • True
  • False
 

Reference
Please reference the following publication when citing ChIP-Enrich:

1. R.P. Welch, C. Lee, R.A. Smith, S. Patil, T. Weymouth, P. Imbriano, L.J. Scott, M.A. Sartor. "ChIP-Enrich: Gene set enrichment testing for ChIP-seq data." NAR. 2014.

Change log for this page can be accessed here
For support and questions email: snehal@med.umich.edu


Copyright 2013 The University of Michigan
Developed under the support of the NIH/NCI
Grant # R01-CA158286-01A1
Terms of Use