PAVED - A software suite for the analysis of epigenome-derived next-generation sequencing data
Pipeline to find Nuclease Hypersensitive (NH) sites using MNAse data

This section describes the detailed steps involved in the analysis of the test MNAse data described in the manuscript "PAVED- A software suite for the analysis of epigenomic next-generation sequencing data" by Jshaik et al.

Prerequisites

1) Align the fastq files to the genome of interest using the alignment algorithm of your choice (e.g. BWA, Bowtie or Novoalign).
2) Convert the alignments to binary alignment map (BAM) format and sort them by genomic position using Samtools.

Experimental Design Followed

Pipeline

To save space, the input datasets for Steps 1 and 2 are not included. Instead, the fragment-constructed BAM files that result from Steps 1 and 2 are provided in the directory .\Analysis\BAMFilesInsertSize. Step 3 onwards can be tried using these datasets.

Step 1: Find the median insert size for each of your datasets using the script findInsertSizeStatistics. From the output files generated, decide on the minimum and maximum tolerable insert sizes. Do this for both the control and experimental datasets. Example:

    java -jar C:\Britta\manuscript\Analysis\PAVED.jar findInsertSizeStatistics -i "C:\Britta\manuscript\Analysis\BAMFile\ControlRep1Chr5.bam" -o "C:\Britta\manuscript\Analysis\insertSize\ControlRep1Chr5.txt"

Step 2: Construct fragments and keep only those fragments that fall within the specified insert size range. Do this for both the control and experimental datasets. Example:

    java -jar C:\Britta\manuscript\Analysis\PAVED.jar filterBAMbyInsertSize -i "C:\Britta\manuscript\Analysis\BAMFile\ControlRep1Chr5.bam" -o "C:\Britta\manuscript\Analysis\BAMFilesInsertSize\ControlRep1Chr5.bam" -m 104 -n 328

Step 3: Remove PCR duplicates. Do this for both the control and experimental datasets.

    java -jar C:\Britta\manuscript\Analysis\PAVED.jar findRemovePCRDuplicates -i C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.bam -j C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.pcrDupl -o C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5PCRDuplRem.bam -k 5

Step 4: Find the fragment depth for each of the samples using the BAM files generated in Step 3. Do this for both the control and experimental datasets.

    java -jar C:\Britta\manuscript\Analysis\PAVED.jar findFragmentDepth -i "C:\Britta\manuscript\Analysis\BAMFilesInsertSize\MNAseRep2Chr5.bam" -o "C:\Britta\manuscript\Analysis\MNAse\MNAseRep2Chr5.depth" -s 1

Step 5: Find the average coverage per chromosome. Do this for both the control and experimental datasets. Example:

    java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findAverageCoveragePerChromosome -i C:\Britta\manuscript\Analysis\data\MNAse\ShearedFV1Chr5PCRDuplRem.depth -o C:\Britta\manuscript\Analysis\data\MNAse\ShearedFV1Chr5PCRDuplRem.covPerChr

Step 6: Tag low-coverage regions. Do this for both the control and experimental datasets. Example:

    java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar filterOutLowReadDepth -i C:\Britta\manuscript\Analysis\data\MNAse\ShearedFV1Chr5PCRDuplRem.depth -o C:\Britta\manuscript\Analysis\data\MNAse\ShearedFV1Chr5PCRDuplRemFiltered.depth -j C:\Britta\manuscript\Analysis\data\MNAse\ShearedFV1Chr5PCRDuplRem.AvgCov -m 0.2

Step 7: Perform read count normalization of the coverage files using the average values found in Step 5. Do this for both the control and experimental datasets.

    java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar readDepthNormalization -i C:\Britta\manuscript\Analysis\data\MNAse\ShearedFV1Chr5PCRDuplRemFiltered.depth -o C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\ShearedFV1Chr5PCRDuplRemFilteredNorm.depth -j C:\Britta\manuscript\Analysis\data\MNAse\ShearedFV1Chr5PCRDuplRem.AvgCov

Step 8: Normalize MNAse-chromatin and MNAse-naked DNA using the sheared control. Repeat this for the MNAse experimental dataset.

    java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findValleys -i C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\MNAseVsSheared.wig -o C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\MNAseVsSheared.valleys -m 0.2 -n 10

Step 9: Find the valleys. Repeat this for NakedDNAvsSheared.

    java -jar C:\Britta\manuscript\Analysis\PAVED.jar findValleys -i C:\Britta\manuscript\Analysis\MNAse\normalizedData\MNAseChrVsSheared.wig -o C:\Britta\manuscript\Analysis\MNAse\normalizedData\MNAseChrVsSheared.valleys -m 0.4 -n 10

Step 10: Find the MNAse chromatin-specific valleys.

    java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar compareBEDFiles -i C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\ControlVsSheared.valleys -j C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\MNAseVsSheared.valleys -o C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\MNAseSpecific.valleys

Step 11: Find annotations for the MNAse chromatin-specific valleys.

    java -jar C:\Britta\manuscript\Analysis\code\PAVED.jar findAnnotations -i C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\MNAseSpecific.valleys -j C:\Britta\manuscript\Analysis\data\annotations\LmajorFriedlin_TriTrypDB-4.0.gff -o C:\Britta\manuscript\Analysis\data\MNAse\normalizedData\MNAseSpecific.annot
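Because Steps 4 to 7 have to be repeated for every control and experimental dataset, it can help to generate the command lines from a small script instead of editing paths by hand. The following is only a sketch under stated assumptions, not part of PAVED: it uses the subcommand names and flags exactly as they appear in the examples above, assumes that the file passed with -j in Steps 6 and 7 is the output of Step 5, and uses a hypothetical naming scheme and hypothetical sample names for the intermediate files. It only prints the commands and does not run them.

```python
from pathlib import PureWindowsPath

# Locations copied from the examples above; adjust them to your own installation.
JAR = PureWindowsPath(r"C:\Britta\manuscript\Analysis\code\PAVED.jar")
DATA = PureWindowsPath(r"C:\Britta\manuscript\Analysis\data\MNAse")


def paved(subcommand, *args):
    """Return one PAVED invocation as an argument list."""
    return ["java", "-jar", str(JAR), subcommand, *map(str, args)]


def depth_commands(bam, sample):
    """Build the Step 4 to Step 7 commands for one sample, in order.

    The output file names are a hypothetical convention modelled on the
    examples; the flags and their values are the ones shown in the steps.
    """
    depth = DATA / f"{sample}.depth"
    # Assumption: the Step 5 output is the file Steps 6 and 7 expect via -j.
    avg = DATA / f"{sample}.covPerChr"
    filtered = DATA / f"{sample}Filtered.depth"
    norm = DATA / "normalizedData" / f"{sample}FilteredNorm.depth"
    return [
        # Step 4: fragment depth
        paved("findFragmentDepth", "-i", bam, "-o", depth, "-s", 1),
        # Step 5: average coverage per chromosome
        paved("findAverageCoveragePerChromosome", "-i", depth, "-o", avg),
        # Step 6: tag low-coverage regions
        paved("filterOutLowReadDepth", "-i", depth, "-o", filtered,
              "-j", avg, "-m", 0.2),
        # Step 7: read count normalization
        paved("readDepthNormalization", "-i", filtered, "-o", norm, "-j", avg),
    ]


if __name__ == "__main__":
    # Hypothetical sample list: one control and one experimental dataset.
    bam_dir = PureWindowsPath(r"C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize")
    for sample in ("ShearedFV1Chr5PCRDuplRem", "MNAseRep2Chr5PCRDuplRem"):
        for cmd in depth_commands(bam_dir / f"{sample}.bam", sample):
            print(" ".join(cmd))
```

The printed lines can be pasted into a command prompt or saved to a batch file. Keeping each command as an argument list also makes it straightforward to pass it to a process launcher later if you prefer to run the steps directly.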