
PAVED: A software suite for the analysis of epigenome-derived next-generation sequencing data


Pipeline to find Nuclease Hypersensitive (NH) sites using FAIRE data

This section describes the detailed steps involved in the analysis of the test FAIRE data described in the manuscript "PAVED- A software suite for the analysis of epigenomic next-generation sequencing data" by Jshaik et al.

Prerequisites

1) Align the FASTQ files to the genome of interest using an aligner of your choice (e.g. BWA, Bowtie, or Novoalign).
2) Convert the alignments to binary alignment map (BAM) format and sort them by genomic position using Samtools.
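
The two prerequisites can be scripted. The sketch below only builds the command lines (it does not run them) for a paired-end BWA-MEM alignment followed by sorting and indexing with Samtools. BWA is just one of the aligners named above, all file names are hypothetical placeholders, and the sketch assumes the reference has already been indexed with "bwa index" and that a Samtools 1.x release is installed.

```python
def alignment_commands(reference, fastq1, fastq2, prefix):
    """Return command lines for alignment, BAM conversion/sorting and indexing.

    All file names are placeholders supplied by the caller.
    """
    sam = prefix + ".sam"
    bam = prefix + ".sorted.bam"
    return [
        # Paired-end alignment; bwa mem writes SAM to the file given with -o.
        ["bwa", "mem", "-o", sam, reference, fastq1, fastq2],
        # Convert to BAM and sort by genomic position in one step.
        ["samtools", "sort", "-o", bam, sam],
        # Index the sorted BAM so that it can be queried by region.
        ["samtools", "index", bam],
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in alignment_commands("genome.fa", "reads_1.fastq",
                                  "reads_2.fastq", "ControlRep1"):
        print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.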

Experimental Design Followed

[Figure: FAIRE pipeline]

Pipeline

To save space, the input datasets for Steps 1 and 2 are not included. Instead, the fragment-constructed BAM files that result from Steps 1 and 2 are provided in the directory .\Analysis\BAMFilesInsertSize, so Steps 3 onwards can be tried using these datasets.

Step1: Find the median insert size for each of your datasets using the findInsertSizeStatistics script. Use the generated output files to decide on the minimum and maximum tolerable insert sizes.

e.g. java -jar C:\Britta\manuscript\Analysis\PAVED.jar findInsertSizeStatistics -i "C:\Britta\manuscript\Analysis\data\BAMFile\ControlRep1Chr5.bam" -o "C:\Britta\manuscript\Analysis\data\insertSize\ControlRep1Chr5.txt"
Do this for both control and experimental datasets.
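
One possible way to turn insert size statistics into minimum and maximum bounds is to take percentiles around the median. The sketch below is an illustration only: it works on a plain list of insert sizes (the format of the findInsertSizeStatistics output file is not described here, so no parser is attempted), and the 5th and 95th percentiles are arbitrary assumptions, not values recommended by PAVED.

```python
import statistics


def insert_size_summary(sizes, lower_pct=5, upper_pct=95):
    """Return the median plus percentile-based candidate bounds.

    sizes: iterable of insert sizes, one per read pair.
    lower_pct / upper_pct: integer percentiles; 5 and 95 are arbitrary
    illustrative defaults (an assumption, not a PAVED setting).
    """
    ordered = sorted(sizes)
    if not ordered:
        raise ValueError("no insert sizes given")

    def percentile(pct):
        # Nearest-rank percentile, using integer ceiling division.
        rank = max(1, (pct * len(ordered) + 99) // 100)
        return ordered[rank - 1]

    return {
        "median": statistics.median(ordered),
        "min_size": percentile(lower_pct),
        "max_size": percentile(upper_pct),
    }
```

The resulting min_size and max_size are candidates to inspect alongside the insert size distribution before fixing the bounds used in the next step.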

Step2: Construct fragments and keep only those whose insert size falls within the specified range.

e.g. java -jar C:\Britta\manuscript\Analysis\PAVED.jar filterBAMbyInsertSize -i "C:\Britta\manuscript\Analysis\data\BAMFile\ControlRep1Chr5.bam" -o "C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\ControlRep1Chr5.bam" -m 104 -n 328
Do this for both control and experimental datasets.
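
The idea behind this step can be sketched in a few lines. The code below is not PAVED's implementation: it assumes fragments are given as (chromosome, start, end) tuples with 1-based inclusive coordinates, and it assumes that both bounds are inclusive. Reading 104 and 328 in the command above as the minimum and maximum size is also an assumption.

```python
def filter_by_insert_size(fragments, min_size, max_size):
    """Keep fragments whose length lies in [min_size, max_size].

    fragments: iterable of (chrom, start, end), 1-based inclusive.
    Inclusive bounds are an assumption made for this sketch.
    """
    kept = []
    for chrom, start, end in fragments:
        length = end - start + 1
        if min_size <= length <= max_size:
            kept.append((chrom, start, end))
    return kept
```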

Step3: Remove PCR duplicates

java -jar C:\Britta\manuscript\Analysis\PAVED.jar findRemovePCRDuplicates -i C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.bam -j C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.pcrDupl -o C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5PCRDuplRem.bam -k 5
Do this for both control and experimental datasets.
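
Because the same command has to be repeated for every dataset, it can help to generate the command lines from a list of sample names. The sketch below reuses the flags exactly as they appear in the command above without interpreting them; the helper name remove_duplicates_command is hypothetical, and the paths should be adjusted to your own installation.

```python
from pathlib import PureWindowsPath


def remove_duplicates_command(jar, bam_dir, sample, k=5):
    """Build the findRemovePCRDuplicates command line for one sample.

    The -i, -j, -o and -k flags are copied from the example command;
    their meaning is not interpreted here.
    """
    bam_dir = PureWindowsPath(bam_dir)
    return [
        "java", "-jar", str(PureWindowsPath(jar)), "findRemovePCRDuplicates",
        "-i", str(bam_dir / (sample + ".bam")),
        "-j", str(bam_dir / (sample + ".pcrDupl")),
        "-o", str(bam_dir / (sample + "PCRDuplRem.bam")),
        "-k", str(k),
    ]


if __name__ == "__main__":
    jar = r"C:\Britta\manuscript\Analysis\PAVED.jar"
    bam_dir = r"C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize"
    # One experimental and one control sample, named as in the examples.
    for sample in ["FAIRErep2Chr5", "FAIREinputChr5"]:
        print(" ".join(remove_duplicates_command(jar, bam_dir, sample)))
```

Each returned list can be handed to subprocess.run(cmd, check=True) to execute the step.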

Step4: Find the fragment depth for each sample using the BAM files generated in Step 3.

e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findFragmentDepth -i "C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIREinputChr5PCRDuplRem.bam" -o "C:\Britta\manuscript\Analysis\data\FAIRE\FAIREinputChr5PCRDuplRem.depth" -s 1
Do this for both control and experimental datasets.
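
Fragment depth is the number of fragments covering each position. As a rough illustration (not PAVED's code, and without interpreting the -s option), the sketch below computes per-base depth on a single chromosome from (start, end) fragments with 1-based inclusive coordinates, using a difference array so that each fragment costs constant time.

```python
def fragment_depth(fragments, chrom_length):
    """Return per-base depth for positions 1..chrom_length.

    fragments: iterable of (start, end), 1-based inclusive, on one
    chromosome. Element i of the result is the depth at position i + 1.
    """
    # diff[p] holds the change in depth when moving onto position p.
    diff = [0] * (chrom_length + 2)
    for start, end in fragments:
        diff[start] += 1
        diff[end + 1] -= 1

    depth = []
    running = 0
    for pos in range(1, chrom_length + 1):
        running += diff[pos]
        depth.append(running)
    return depth
```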

Step5: Find average coverage per chromosome

e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findAverageCoveragePerChromosome -i C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.covPerChr
Do this for both control and experimental datasets.
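
Conceptually, the average coverage of a chromosome is the mean of its per-position depth values. The sketch below assumes the depth values are already in memory, keyed by chromosome name, and averages over every listed position; whether PAVED averages over all positions or only covered ones is not stated here, so treat that choice as an assumption.

```python
def average_coverage(depth_by_chrom):
    """Return the mean depth per chromosome.

    depth_by_chrom: dict mapping chromosome name to a list of depth
    values. Chromosomes with no values are skipped.
    """
    averages = {}
    for chrom, depth in depth_by_chrom.items():
        if depth:
            averages[chrom] = sum(depth) / len(depth)
    return averages
```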

Step6: Tag low coverage regions

java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar filterOutLowReadDepth -i C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRemFiltered.depth -j C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.AvgCov -m 0.2
Do this for both control and experimental datasets.
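
A simple way to picture the tagging is to flag every position whose depth falls below a fraction of the chromosome average. The sketch below assumes that 0.2 in the command above is such a fraction; this reading is not confirmed on this page, and the function is an illustration rather than PAVED's implementation.

```python
def tag_low_coverage(depth, average, min_fraction=0.2):
    """Pair each depth value with a low-coverage flag.

    A position is flagged when depth < min_fraction * average.
    Treating the threshold as a fraction of the average is an assumption.
    """
    cutoff = min_fraction * average
    return [(d, d < cutoff) for d in depth]
```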

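Step7: Perform read count normalization of the coverage files using the average values found in Step 5. These values are passed with -j; the example below names the file .AvgCov, whereas the Step 5 example writes .covPerChr, so use the name you gave the Step 5 output.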

java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar readDepthNormalization -i C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRemFiltered.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FAIRErep2Chr5PCRDuplRemFilteredNorm.depth -j C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.AvgCov
Do this for both control and experimental datasets.
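
Read count normalization can be pictured as dividing each depth value by the average coverage of its chromosome, which makes samples sequenced to different depths comparable. The sketch below shows that division for one chromosome; it is a simplified illustration and may differ from what readDepthNormalization does internally.

```python
def normalize_depth(depth, average):
    """Divide each depth value by the chromosome's average coverage."""
    if average <= 0:
        raise ValueError("average coverage must be positive")
    return [d / average for d in depth]
```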

Step8: Normalize the FAIRE experimental data against the FAIRE control

java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar foldChangeReadDepth -i C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FAIREinputChr5PCRDuplRemFilteredNorm.depth -j C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FAIRErep2Chr5PCRDuplRemFilteredNorm.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput -k 0
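
The fold change at a position is the normalized experimental depth divided by the normalized control depth. The sketch below illustrates this with a hypothetical pseudocount parameter for positions where the control is zero; it does not attempt to interpret the -k option, and positions that still have a zero denominator are reported as None.

```python
def fold_change(experimental, control, pseudocount=0.0):
    """Return experimental/control per position.

    pseudocount is a hypothetical parameter added to both values; with
    the default of 0, positions with zero control depth yield None.
    """
    if len(experimental) != len(control):
        raise ValueError("tracks must have the same length")
    ratios = []
    for exp_value, ctl_value in zip(experimental, control):
        denominator = ctl_value + pseudocount
        if denominator > 0:
            ratios.append((exp_value + pseudocount) / denominator)
        else:
            ratios.append(None)
    return ratios
```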

Step9: Find the peaks

java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findPeaks -i C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.wig -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.peaks -m 5 -n 10
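
A common way to call peaks on a fold-change track is to report contiguous runs of positions at or above a threshold that are at least a minimum length. The sketch below implements that generic approach; it is not necessarily the algorithm used by findPeaks, and the threshold and min_length parameters are illustrative names that are not mapped onto the -m and -n options.

```python
def find_peaks(values, threshold, min_length):
    """Return (start, end) runs, 1-based inclusive, with value >= threshold.

    values: per-position fold changes; None entries count as below
    threshold. Runs shorter than min_length are discarded.
    """
    peaks = []
    start = None
    for pos, value in enumerate(values, start=1):
        above = value is not None and value >= threshold
        if above and start is None:
            start = pos                      # a run begins here
        elif not above and start is not None:
            if pos - start >= min_length:    # run covered start..pos-1
                peaks.append((start, pos - 1))
            start = None
    # Close a run that extends to the end of the track.
    if start is not None and len(values) - start + 1 >= min_length:
        peaks.append((start, len(values)))
    return peaks
```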

Step10: Find annotations for the FAIRE peaks

java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findAnnotations -i C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.peaks -j C:\Britta\manuscript\Analysis\data\annotations\LmajorFriedlin_TriTrypDB-4.0.gff -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.annot
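
Annotation amounts to finding the GFF features that overlap each peak. The sketch below is a minimal illustration of that idea, not PAVED's implementation: it reads the nine tab-separated GFF columns, keeps chromosome, feature type, start, end and attributes, and reports every feature that overlaps a peak by at least one base. Peaks are assumed to be (chromosome, start, end) tuples with 1-based inclusive coordinates, since the .peaks file format is not described here.

```python
def parse_gff(lines):
    """Parse GFF lines into (chrom, type, start, end, attributes) tuples."""
    features = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue                      # skip blanks, comments, directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue                      # not a feature line
        features.append((cols[0], cols[2], int(cols[3]), int(cols[4]), cols[8]))
    return features


def annotate_peaks(peaks, features):
    """Map each peak to the list of features that overlap it."""
    hits = {}
    for peak in peaks:
        chrom, start, end = peak
        hits[peak] = [
            f for f in features
            if f[0] == chrom and f[2] <= end and f[3] >= start
        ]
    return hits
```

The linear scan over all features is fine for a single chromosome; for whole-genome data an interval index would be a reasonable replacement.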