PAVED- A Software suite for the analysis of epigenome-derived next generation sequencing data
Pipeline to find Nuclease Hypersensitive (NH) sites using FAIRE data

This section describes the detailed steps involved in the analysis of the test FAIRE data described in the manuscript "PAVED- A software suite for the analysis of epigenomic next-generation sequencing data" by Jshaik et al.

Prerequisites
1) Align the fastq files to the genome of interest using your choice of alignment algorithm (BWA, Bowtie or Novoalign).

Experimental Design

Followed Pipeline

To save space, we have left out the datasets for steps 1 and 2 and have instead included the fragment-constructed datasets in the directory .\Analysis\BAMFilesInsertSize. These fragment-constructed BAM files are the result of steps 1 and 2, so steps 3 onward can be tried using them.

Step1: Find the median insert size for each of your datasets using the script findInsertSizeStatistics. From the output files generated, decide on the minimum and maximum tolerable insert sizes.
e.g. java -jar C:\Britta\manuscript\Analysis\PAVED.jar findInsertSizeStatistics -i "C:\Britta\manuscript\Analysis\data\BAMFile\ControlRep1Chr5.bam" -o "C:\Britta\manuscript\Analysis\data\insertSize\ControlRep1Chr5.txt"
Do this for both control and experimental datasets.

Step2: Construct fragments and keep only those fragments whose insert size is within the specified range.
e.g. java -jar C:\Britta\manuscript\Analysis\PAVED.jar filterBAMbyInsertSize -i "C:\Britta\manuscript\Analysis\data\BAMFile\ControlRep1Chr5.bam" -o "C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\ControlRep1Chr5.bam" -m 104 -n 328
Do this for both control and experimental datasets.

Step3: Remove PCR duplicates.
e.g. java -jar C:\Britta\manuscript\Analysis\PAVED.jar findRemovePCRDuplicates -i C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.bam -j C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.pcrDupl -o C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5PCRDuplRem.bam -k 5
Do this for both control and experimental datasets.

Step4: Find the fragment depth for each of the samples using the BAM files generated in Step 3.
e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findFragmentDepth -i "C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIREinputChr5PCRDuplRem.bam" -o "C:\Britta\manuscript\Analysis\data\FAIRE\FAIREinputChr5PCRDuplRem.depth" -s 1
Do this for both control and experimental datasets.

Step5: Find the average coverage per chromosome.
e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findAverageCoveragePerChromosome -i C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.AvgCov
Do this for both control and experimental datasets.

Step6: Tag low-coverage regions.
e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar filterOutLowReadDepth -i C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRemFiltered.depth -j C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.AvgCov -m 0.2
Do this for both control and experimental datasets.

Step7: Perform read count normalization of the coverage files using the average values found in Step 5.
e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar readDepthNormalization -i C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRemFiltered.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FAIRErep2Chr5PCRDuplRemFilteredNorm.depth -j C:\Britta\manuscript\Analysis\data\FAIRE\FAIRErep2Chr5PCRDuplRem.AvgCov
Do this for both control and experimental datasets.

Step8: Normalize the FAIRE experimental data using the FAIRE control.
e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar foldChangeReadDepth -i C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FAIREinputChr5PCRDuplRemFilteredNorm.depth -j C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FAIRErep2Chr5PCRDuplRemFilteredNorm.depth -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput -k 0

Step9: Find the peaks.
e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findPeaks -i C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.wig -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.peaks -m 5 -n 10

Step10: Find annotations for the FAIRE peaks.
e.g. java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findAnnotations -i C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.peaks -j C:\Britta\manuscript\Analysis\data\annotations\LmajorFriedlin_TriTrypDB-4.0.gff -o C:\Britta\manuscript\Analysis\data\FAIRE\normalizedData\FaireRepvsInput.annot
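
Steps 3 to 7 are repeated for both the control and the experimental dataset, so it can help to generate the command lines from the sample name instead of typing them by hand. The Python sketch below is not part of PAVED; it only rebuilds the example commands shown above. The helper names (per_sample_commands, run_all) are hypothetical, and the base directory, jar location and sample names (FAIREinputChr5, FAIRErep2Chr5) are copied from the examples and should be adjusted to your own setup. It also assumes that the Step 5 output is written to the .AvgCov file that Steps 6 and 7 read.

```python
import subprocess
from pathlib import PureWindowsPath

# Locations copied from the example commands (Steps 4-10); adjust to your setup.
BASE = PureWindowsPath(r"C:\Britta\manuscript\Analysis")
JAR = BASE / "code" / "PAVED.jar"
BAM_DIR = BASE / "data" / "BAMFilesInsertSize"
FAIRE_DIR = BASE / "data" / "FAIRE"
NORM_DIR = FAIRE_DIR / "normalizedData"


def per_sample_commands(sample):
    """Build the Step 3-7 command lines for one sample (hypothetical helper)."""
    stem = sample + "PCRDuplRem"
    fragment_bam = str(BAM_DIR / (sample + ".bam"))
    duplicates = str(BAM_DIR / (sample + ".pcrDupl"))
    dedup_bam = str(BAM_DIR / (stem + ".bam"))
    depth = str(FAIRE_DIR / (stem + ".depth"))
    # Assumption: Step 5 writes the .AvgCov file that Steps 6 and 7 read.
    avg_cov = str(FAIRE_DIR / (stem + ".AvgCov"))
    filtered = str(FAIRE_DIR / (stem + "Filtered.depth"))
    normalized = str(NORM_DIR / (stem + "FilteredNorm.depth"))
    java = ["java", "-jar", str(JAR)]
    # Flag values (-k 5, -s 1, -m 0.2) are the ones used in the examples.
    return [
        # Step 3: remove PCR duplicates
        java + ["findRemovePCRDuplicates", "-i", fragment_bam,
                "-j", duplicates, "-o", dedup_bam, "-k", "5"],
        # Step 4: fragment depth
        java + ["findFragmentDepth", "-i", dedup_bam, "-o", depth, "-s", "1"],
        # Step 5: average coverage per chromosome
        java + ["findAverageCoveragePerChromosome", "-i", depth, "-o", avg_cov],
        # Step 6: tag low-coverage regions
        java + ["filterOutLowReadDepth", "-i", depth, "-o", filtered,
                "-j", avg_cov, "-m", "0.2"],
        # Step 7: read count normalization
        java + ["readDepthNormalization", "-i", filtered, "-o", normalized,
                "-j", avg_cov],
    ]


def run_all(samples, dry_run=True):
    """Print every command; execute it as well when dry_run is False."""
    for sample in samples:
        for cmd in per_sample_commands(sample):
            print(" ".join(cmd))
            if not dry_run:
                subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Control and experimental sample names as used in the examples.
    run_all(["FAIREinputChr5", "FAIRErep2Chr5"])
```

By default the script only prints the commands, so they can be checked against the examples before anything is executed; calling run_all(..., dry_run=False) runs them in order and stops at the first failing step.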