PAVED: A software suite for the analysis of epigenome-derived next-generation sequencing data

Identify and Remove PCR duplicates

The findRemovePCRDuplicates utility finds PCR duplicates and filters them out.

Prerequisites

1) Generate a BAM file by aligning the reads to the reference genome of interest and sorting the alignments by genomic position.

How to run it?

Type java -jar PAVED.jar findRemovePCRDuplicates -h to see the list of parameters.

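The findRemovePCRDuplicates utility takes as input a position-sorted .bam file and writes two outputs: a file with information on the PCR duplicates and a .bam file with the PCR duplicates removed.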

Run the utility as follows:

java -jar C:\Britta\manuscript\Analysis\PAVED.jar findRemovePCRDuplicates -i C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.bam -j C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5.pcrDupl -o C:\Britta\manuscript\Analysis\data\BAMFilesInsertSize\FAIRErep2Chr5PCRDuplRem.bam -k 5

Here, "C:\Britta\manuscript\Analysis\" is the location of the jar file on the local disk. -i is the input BAM file, -j is the output file containing information on the PCR duplicates, and -o is the output BAM file after the PCR duplicates have been removed. -k is a threshold: fragments with the same start and end positions that occur more than the specified number of times (e.g. 5) are eliminated.
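To illustrate what the -k threshold means, here is a minimal Python sketch of threshold-based duplicate filtering. It is not PAVED's actual implementation. The fragment representation (chromosome, start, end) and the function name filter_pcr_duplicates are hypothetical, and the sketch assumes that every fragment in a group exceeding the threshold is discarded; the real utility may instead retain some copies.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical fragment representation: (chromosome, start, end).
Fragment = Tuple[str, int, int]


def filter_pcr_duplicates(
    fragments: Iterable[Fragment], k: int
) -> Tuple[List[Fragment], Dict[Fragment, int]]:
    """Drop fragments whose (chromosome, start, end) occurs more than k times.

    Assumption: the whole over-represented group is discarded. PAVED itself
    may handle such groups differently.
    """
    fragments = list(fragments)
    counts = Counter(fragments)
    # Groups that exceed the threshold, with their observed counts.
    duplicates = {frag: n for frag, n in counts.items() if n > k}
    # Keep the original order of the fragments that survive.
    kept = [frag for frag in fragments if frag not in duplicates]
    return kept, duplicates


if __name__ == "__main__":
    reads = (
        [("chr5", 100, 250)] * 6
        + [("chr5", 300, 450)] * 2
        + [("chr5", 500, 640)]
    )
    kept, duplicates = filter_pcr_duplicates(reads, k=5)
    print(len(kept), "fragments kept")
    print(duplicates)
```

In this sketch the returned dictionary plays the role of the -j report and the kept list plays the role of the -o output. With k=5, a group of six identical fragments is removed, while groups of one or two are retained.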