PAVED: A software suite for the analysis of epigenome-derived next-generation sequencing data
Insert Size Statistics

The utility findInsertSizeStatistics finds the median insert size and reports each observed insert size together with the number of fragments that have it. This information helps in choosing the minimum and maximum tolerable insert sizes for fragments to be considered in further processing.

Prerequisites

1) Align the FASTQ files to the genome of interest using the alignment algorithm of your choice (BWA, Bowtie, Novoalign, etc.).
2) Convert the alignments to binary alignment map (BAM) format and sort them using Samtools.

How to run it?

Type java -jar PAVED.jar findInsertSizeStatistics -h to see the list of parameters.

The findInsertSizeStatistics utility takes a sorted BAM file as input and outputs an insert size statistics file. Run the utility as follows:
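The invocation can also be scripted. The following is a minimal Python sketch, not part of PAVED: it assembles the command line from the pieces named in this section (the jar location, the utility name, and the -i and -o options) and prints it. The file names sample.sorted.bam and insert_size_stats.txt are hypothetical placeholders, and passing each option and its value as separate arguments is an assumption.

```python
import subprocess
from pathlib import PureWindowsPath


def build_command(jar_dir, input_bam, output_file):
    """Assemble the findInsertSizeStatistics command line as an argument list."""
    # PAVED.jar is assumed to sit directly inside jar_dir.
    jar = str(PureWindowsPath(jar_dir) / "PAVED.jar")
    return ["java", "-jar", jar, "findInsertSizeStatistics",
            "-i", input_bam, "-o", output_file]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Hypothetical file names; replace them with your own sorted BAM and output path.
cmd = build_command("C:\\Britta\\manuscript\\Analysis\\",
                    "sample.sorted.bam", "insert_size_stats.txt")
run(cmd)  # dry run: only prints the assembled command
```

Calling run(cmd, dry_run=False) would launch Java, so it needs a working Java installation and the jar at the given location.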
Here, "C:\Britta\manuscript\Analysis\" is the location of the jar file on the local disk, -i is the input sorted BAM file and -o is the output file.

Sample output

The sample output of the program is as follows:

For this input file, the median insert size is 102. This means that about half of the fragments in this collection have an insert size of 102 or less. The file also lists each insert size and the number of fragments that have that insert size. For example, there are 6 fragments with an insert size of 72.
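To see how a median follows from such a table of insert sizes and fragment counts, and how the same table could suggest minimum and maximum cut-offs, here is a minimal Python sketch. It is not PAVED's implementation. The counts are toy values invented for illustration, not the sample output, and using the 5th and 95th percentiles as cut-offs is only one possible choice.

```python
def weighted_quantile(histogram, q):
    """Return the smallest insert size whose cumulative fragment count
    reaches the fraction q of all fragments (q = 0.5 gives the lower median)."""
    total = sum(histogram.values())
    threshold = q * total
    running = 0
    for size in sorted(histogram):
        running += histogram[size]
        if running >= threshold:
            return size
    raise ValueError("empty histogram")


# Toy data (invented): insert size -> number of fragments with that size.
toy_histogram = {72: 6, 95: 20, 102: 30, 110: 25, 150: 9}

median = weighted_quantile(toy_histogram, 0.5)
# Assumed cut-off rule: keep fragments between the 5th and 95th percentiles.
min_size = weighted_quantile(toy_histogram, 0.05)
max_size = weighted_quantile(toy_histogram, 0.95)

print("median insert size:", median)
print("suggested range:", min_size, "to", max_size)
```

On real data, the dictionary would be filled from the statistics file written by the utility. How to parse that file depends on its layout, which this section does not specify.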