Introduction

BEDTools is a suite of open-source utilities for analyzing genomic sequence and coverage from various file types, including BED, SAM and BAM files. Although AltAnalyze can now directly process BAM files to produce BED files, the use of BEDTools may be more efficient when processing dozens to hundreds of BAM files. BEDTools can be easily compiled on Unix, Linux and Mac OS X operating systems.

Details

See the BEDTools documentation for more information.

Usage with AltAnalyze

After installing BEDTools, AltAnalyze users will need to call the bamToBed utility (recognized on Unix systems once the BEDTools directory has been added to the PATH in the local or global .bashrc file). TopHat produces the file accepted_hits.bam with each run, in the same output directory as the junction BED file.
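
Before running the commands on this page, it can help to confirm that the shell actually finds the BEDTools utilities. The following is a minimal sketch, not part of BEDTools or AltAnalyze: require_tool is a hypothetical helper name, and it relies only on the standard shell builtin command -v.

```shell
# Report whether a command is available on the current PATH.
# require_tool is a hypothetical helper, not part of BEDTools or AltAnalyze.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 (add the BEDTools directory to PATH, e.g. in ~/.bashrc)"
    return 1
  fi
}

# The two BEDTools utilities used in the commands on this page.
for tool in bamToBed coverageBed; do
  require_tool "$tool" || true   # keep going so every missing tool is reported
done
```

If either utility is reported as missing, open a new terminal after editing .bashrc (or run "source ~/.bashrc") and check again.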

In the example below, "hESC_differentiation_exons.bed" is produced by AltAnalyze prior to running BEDTools (see instructions here). It contains all known mRNA exon region coordinates from Ensembl/UCSC and all novel exon coordinates indicated by the TopHat junction BED results. These methods should work equivalently for BAM files not produced by TopHat; however, additional sorting of the BAM file may be required (e.g., with SAMtools).

Build Exon BED file from BAM

For a Single BAM File

bamToBed -i accepted_hits.bam -split | coverageBed -a stdin \
  -b /home/user/BAMtoBED/Hs_cancer_exons.bed > \
  /home/user/RNASeqStudy/Sample1/day0_s1__exons.bed

For Many BAM Files (one per folder)

for f in */accepted_hits.bam; do
  # Use the name of the folder containing each BAM file as the output prefix
  parentdir=$(dirname "$f")
  parentdirname=$(basename "$parentdir")
  bamToBed -i "$f" -split | coverageBed -a stdin \
    -b Hs_cancer_exons.bed > "${parentdirname}__exon.bed"
done

This loops through every folder in the current directory, finds its accepted_hits.bam file and names the resulting file "folder_name"__exon.bed, written to the current directory. A single command therefore produces the exon BED files for all samples.
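
Because the loop can run for a long time on many samples, it may be worth previewing the output names first. The sketch below applies the same naming rule but only prints the mapping and does not call bamToBed or coverageBed. The function name preview_exon_bed_names is hypothetical and is used here for illustration only.

```shell
# Print the output file name the loop would use for each accepted_hits.bam,
# without running bamToBed or coverageBed.
# preview_exon_bed_names is a hypothetical helper name.
preview_exon_bed_names() {
  for f in */accepted_hits.bam; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    parentdirname=$(basename "$(dirname "$f")")
    echo "$f -> ${parentdirname}__exon.bed"
  done
}

preview_exon_bed_names
```

Run it from the directory that contains the per-sample folders. If two BAM files would map to the same output name, or a folder is missing from the list, that is easier to fix before the full run.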