
PAVED: A software suite for the analysis of epigenome-derived next-generation sequencing data


Insert Size Statistics

The findInsertSizeStatistics utility reports the median insert size and, for each observed insert size, the number of fragments with that size. This distribution helps you choose the minimum and maximum insert sizes to accept when selecting fragments for further processing.
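One way to turn an insert size distribution into minimum and maximum cutoffs is to keep the central share of fragments. The sketch below is only an illustration of that idea, not PAVED's own method: the function name `percentile_bounds`, the cutoff fractions, and the toy counts are all assumptions made for the example.

```python
def percentile_bounds(histogram, lower=0.01, upper=0.99):
    """Return (min_size, max_size) bracketing the central share of fragments.

    histogram maps insert size -> number of fragments with that size.
    lower and upper are cumulative fractions; the defaults are illustrative.
    """
    total = sum(histogram.values())
    if total == 0:
        raise ValueError("histogram is empty")
    min_size = max_size = None
    running = 0
    for size in sorted(histogram):
        running += histogram[size]
        fraction = running / total
        # First size at which the cumulative fraction reaches the lower cutoff.
        if min_size is None and fraction >= lower:
            min_size = size
        # First size at which the cumulative fraction reaches the upper cutoff.
        if fraction >= upper:
            max_size = size
            break
    return min_size, max_size


# Toy distribution (invented counts, not real PAVED output).
toy_histogram = {72: 6, 95: 10, 102: 20, 110: 9, 130: 5}
bounds = percentile_bounds(toy_histogram, lower=0.15, upper=0.85)
print(bounds)
```

Tighter fractions discard more fragments from the tails; the right trade-off depends on the library preparation and is a judgment call.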

Prerequisites

1) Align the FASTQ files to the genome of interest using the alignment algorithm of your choice (BWA, Bowtie, Novoalign, etc.).
2) Convert the alignments to binary alignment map (BAM) format and sort them using Samtools.
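The two prerequisite steps could be scripted roughly as follows. This is a sketch under several assumptions: BWA is the chosen aligner, the reads are paired-end, the reference has already been indexed with `bwa index`, and all file names are hypothetical. The helper names `alignment_commands` and `run_pipeline` are invented for this example.

```python
import subprocess


def alignment_commands(reference, reads_1, reads_2, sorted_bam):
    """Build the argument lists for aligning with BWA and sorting with Samtools."""
    # bwa mem writes SAM records to standard output.
    align = ["bwa", "mem", reference, reads_1, reads_2]
    # samtools sort reads from standard input ("-") and writes a sorted BAM file.
    sort = ["samtools", "sort", "-o", sorted_bam, "-"]
    return align, sort


def run_pipeline(align, sort):
    """Pipe the aligner output straight into the sorter (needs both tools installed)."""
    aligner = subprocess.Popen(align, stdout=subprocess.PIPE)
    subprocess.run(sort, stdin=aligner.stdout, check=True)
    aligner.stdout.close()
    if aligner.wait() != 0:
        raise RuntimeError("alignment step failed")


# Hypothetical file names, for illustration only.
align_cmd, sort_cmd = alignment_commands(
    "genome.fa", "sample_R1.fastq", "sample_R2.fastq", "sample.sorted.bam"
)
print(" ".join(align_cmd), "|", " ".join(sort_cmd))
```

Calling `run_pipeline(align_cmd, sort_cmd)` would execute the two steps; it is left out of the example so that the sketch can be read without the tools installed.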

How to run it?

Type java -jar PAVED.jar findInsertSizeStatistics -h to see the list of parameters.

The findInsertSizeStatistics utility takes a sorted BAM file as input and writes an insert size statistics file as output.

Run the utility as follows:
java -jar C:\Britta\manuscript\Analysis\PAVED.jar findInsertSizeStatistics -i "C:\Britta\manuscript\Analysis\BAMFile\ControlRep1Chr5.bam" -o "C:\Britta\manuscript\Analysis\insertSize\ControlRep1Chr5.txt"

Here, "C:\Britta\manuscript\Analysis\" is the location of the jar file on the local disk, -i is the sorted input BAM file, and -o is the output file.
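When several BAM files need the same treatment, the command line can be assembled programmatically. The sketch below only builds the argument list, using the -i and -o options described above; the helper name `build_command` and the rule of naming the output after the BAM file are assumptions for illustration.

```python
from pathlib import PureWindowsPath


def build_command(jar_path, bam_path, out_dir):
    """Return the argument list for one findInsertSizeStatistics run."""
    bam = PureWindowsPath(bam_path)
    # Assumed naming rule: same base name as the BAM file, with a .txt extension.
    out_file = PureWindowsPath(out_dir) / (bam.stem + ".txt")
    return [
        "java", "-jar", str(PureWindowsPath(jar_path)),
        "findInsertSizeStatistics",
        "-i", str(bam),
        "-o", str(out_file),
    ]


command = build_command(
    r"C:\Britta\manuscript\Analysis\PAVED.jar",
    r"C:\Britta\manuscript\Analysis\BAMFile\ControlRep1Chr5.bam",
    r"C:\Britta\manuscript\Analysis\insertSize",
)
print(command)
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where Java and PAVED.jar are available.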

Sample output

The sample output of the program is as follows:

[Figure: insert size output]

For this input file, the median insert size is 102. This means that about half of the fragments in this collection have an insert size of 102 or less, and about half have an insert size of 102 or more. The file also lists each insert size together with the number of fragments that have that insert size. For example, there are 6 fragments with an insert size of 72.
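To make the relationship between the per-size counts and the median concrete, the sketch below computes a median from a table of insert sizes and fragment counts. Only the pair "insert size 72, 6 fragments" comes from the example above; every other count is invented so that the toy median works out to 102. Averaging the two middle values for an even number of fragments is a common convention and is an assumption here, not necessarily what PAVED does.

```python
def median_from_histogram(histogram):
    """Median insert size from a mapping of insert size -> fragment count."""
    total = sum(histogram.values())
    if total == 0:
        raise ValueError("histogram is empty")
    # 0-based positions of the middle fragment(s) in the sorted fragment list.
    lo_pos = (total - 1) // 2
    hi_pos = total // 2
    lo_val = hi_val = None
    seen = 0
    for size in sorted(histogram):
        seen += histogram[size]
        if lo_val is None and seen > lo_pos:
            lo_val = size
        if seen > hi_pos:
            hi_val = size
            break
    # For an odd total both positions coincide; for an even total, average them.
    return (lo_val + hi_val) / 2


# Toy counts: only 72 -> 6 mirrors the example; the rest are invented.
toy_counts = {72: 6, 95: 10, 102: 20, 110: 9, 130: 5}
toy_median = median_from_histogram(toy_counts)
print(toy_median)
```

In this toy table only 20 of the 50 fragments have an insert size of exactly 102, which shows that the median marks the middle of the distribution rather than the size shared by most fragments.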