Identify and Remove PCR duplicates
The utility findRemovePCRDuplicates finds PCR duplicates and filters them out.
Prerequisites
1) Generate a bam file by aligning the reads to the reference genome of interest and sorting the alignments by genomic position.
How to run it?
Type java -jar PAVED.jar findRemovePCRDuplicates -h
to see the list of parameters.
The findRemovePCRDuplicates utility takes as input a .bam file and writes a .bam file with the PCR duplicates removed, together with a file containing information on the duplicates.
Run the utility as follows:
java -jar C:\Britta\manuscript\Analysis\PAVED.jar findRemovePCRDuplicates -i C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.bam -j C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.pcrDupl -o C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5PCRDuplRem.bam -k 5
Here, "C:\Britta\manuscript\Analysis\" is the location of the
jar file on the local disk, -i is the input bam file, -j is the
output file containing information on the PCR duplicates, -o is the
output bam file after removing the PCR duplicates, and -k is a
threshold (e.g. 5): fragments that share the same start and end
positions and occur more than this number of times are eliminated.
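
The effect of the -k threshold can be illustrated with a minimal Python sketch. This is not the PAVED implementation. It assumes each fragment is reduced to a (chromosome, start, end) tuple and that every copy in a group larger than k is dropped; the text above does not say whether the utility keeps any copies from such a group. The function name remove_pcr_duplicates and the sample coordinates are hypothetical.

```python
from collections import Counter


def remove_pcr_duplicates(fragments, k):
    """Split fragments into (kept, duplicate_groups).

    fragments: iterable of (chrom, start, end) tuples, for example taken
               from a position-sorted bam file.
    k:         groups with more than k identical fragments are treated
               as PCR duplicates.
    """
    fragments = list(fragments)
    counts = Counter(fragments)

    # Groups whose size exceeds the threshold (this mirrors the -j report).
    duplicate_groups = {frag: n for frag, n in counts.items() if n > k}

    # Assumption: every copy in an over-represented group is dropped.
    kept = [frag for frag in fragments if frag not in duplicate_groups]
    return kept, duplicate_groups


# Hypothetical fragments: six identical copies plus two unique ones.
fragments = [("chr5", 100, 250)] * 6 + [("chr5", 300, 420), ("chr5", 500, 650)]
kept, duplicates = remove_pcr_duplicates(fragments, k=5)
print(kept)
print(duplicates)
```

With k=5, the six identical fragments exceed the threshold and are removed, while a group of exactly five copies would be kept, because the comparison is strictly "greater than".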