<h2>Contents</h2>
<p>This directory contains processed data files from the following publication:</p>
<p>
<a href="http://www.plantcell.org/content/23/4/1293.full">
Prediction of Regulatory Interactions from Genome Sequences Using a Biophysical Model for the Arabidopsis LEAFY Transcription Factor</a>
</p>
<p>
Data files were processed using steps adapted from the iPlant tutorial:
</p>
<p><a href="https://pods.iplantcollaborative.org/wiki/x/WBK">Using the iPlant Discovery Environment and Atmosphere for Advanced ChIP-seq Analysis</a></p>

<pre>
Name                                     Last modified     Size  Description
LFY_filtered_summit_extended.bed.gz.tbi  2017-07-25 14:59  9.8K  Tabix index file
LFY.bedgraph.gz.tbi                      2017-07-25 14:59  85K   BedGraph (wiggle) format file
LFY.bedgraph.gz                          2017-07-25 14:59  27M   BedGraph (wiggle) format file
CNTRL.bedgraph.gz.tbi                    2017-07-25 15:00  69K   BedGraph (wiggle) format file
CNTRL.bedgraph.gz                        2017-07-25 15:00  6.0M  BedGraph (wiggle) format file
LFY_filtered_summit_extended.bed.gz      2017-07-25 14:59  24K   Annotation or junction file
peakranger_results/                      2017-07-25 15:00  -
LFY.bam.bai                              2017-07-25 15:00  281K
LFY.bam                                  2017-07-25 14:59  665M
LFY-binding-sites.pptx                   2017-07-25 15:00  708K
CNTRL.bam.bai                            2017-07-25 14:59  256K
CNTRL.bam                                2017-07-25 15:00  264M
</pre>
<h2>Data processing protocol</h2>
<p>
Lines starting with $ are commands executed at the UNIX prompt.
</p>
<hr>
<pre>
Get files from the Short Read Archive (SRA) and save them to a new local directory, SRP003928:

  $ wget -nd -nH -r -P SRP003928 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP003/SRP003928

Convert to fastq (run from inside the SRP003928 directory, where wget saved the .sra files):
  
  $ fastq-dump *.sra

Creates four files:

  SRR070382.fastq
  SRR070383.fastq
  SRR070384.fastq
  SRR070385.fastq

For each file, do the following (SRR070382 is shown as the example):

Align to the genome using soap 2.21:

  $ soap -v 2 -r 0 -p 4 -D A_thaliana_Jun_2009.fa.index -a SRR070382.fastq -o SRR070382.out

Convert the output to sam format using soap2sam.pl from the soap distribution:

  $ soap2sam.pl SRR070382.out > SRR070382.sam

Convert .sam files to sorted bam files (indexing is done after the merge step):

  $ samtools view -t genome.txt -b -u -S SRR070382.sam -o - | samtools sort - SRR070382 

Note: genome.txt is a tab-delimited list of chromosome names and sizes, available here: <a href="http://igbquickload.org/quickload/A_thaliana_Jun_2009/genome.txt">genome.txt</a>.
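
The per-file steps above can be wrapped in a loop over the four runs.
The sketch below is a dry run only: print_run_commands is a hypothetical
helper name, and it prints the commands from this protocol for review
instead of executing soap or samtools.

```shell
# print_run_commands RUN: print (do not execute) the align, convert and
# sort commands from this protocol for one SRA run accession.
# The function name is an illustrative assumption, not part of any tool.
print_run_commands() {
  run=$1
  printf '%s\n' \
    "soap -v 2 -r 0 -p 4 -D A_thaliana_Jun_2009.fa.index -a $run.fastq -o $run.out" \
    "soap2sam.pl $run.out > $run.sam" \
    "samtools view -t genome.txt -b -u -S $run.sam -o - | samtools sort - $run"
}

# Dry run over the four runs downloaded above.
for acc in SRR070382 SRR070383 SRR070384 SRR070385; do
  print_run_commands "$acc"
done
```

Once the printed commands look right, the same output can be piped to sh
to run them.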

Merge Control and IP sample alignment files:

  $ samtools merge CNTRL.bam SRR070382.bam SRR070383.bam
  $ samtools index CNTRL.bam
  $ samtools merge LFY.bam SRR070384.bam SRR070385.bam
  $ samtools index LFY.bam

Run peakranger 1.16 as in the iPlant tutorial <a href="https://pods.iplantcollaborative.org/wiki/x/WBK">https://pods.iplantcollaborative.org/wiki/x/WBK</a>:

  $ peakranger ranger --format bam --ext_length 200 --FDR 0.01 -d LFY.bam -c CNTRL.bam -o LFY -t 4 --delta 0.8 --bandwidth 99

Creates files:

  LFY_details
  LFY_region.bed
  LFY_summit.bed

Keep only the pvalue 0 peaks from LFY_summit.bed (grep retains the lines that contain pval_0):

  $ grep pval_0 LFY_summit.bed > LFY_summit_filtered.bed
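
Because grep keeps the lines that match, the step above retains the
summits whose text contains pval_0. A minimal sketch on made-up BED lines
shows the effect. The coordinates and name fields are hypothetical and may
not match real PeakRanger output; keep_pval0 and example_summits are
illustrative names.

```shell
# keep_pval0: read BED lines on stdin and keep those containing "pval_0".
# This is the same filter as the protocol step, wrapped in a function.
keep_pval0() {
  grep pval_0
}

# Hypothetical summit lines (tab-separated); real name fields may differ.
example_summits() {
  printf 'chr1\t100\t101\texample_pval_0_fdr_0\n'
  printf 'chr1\t500\t501\texample_pval_1e-05_fdr_0.001\n'
  printf 'chr2\t900\t901\texample_pval_0.03_fdr_0.2\n'
}

# Keeps the first and third lines: "pval_0" is matched as a substring,
# so a name containing pval_0.03 also passes the filter.
example_summits | keep_pval0
```

If the real name fields can carry values such as pval_0.03, a stricter
pattern would be needed to keep only exact zeros.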

Expand summits 50 bases on both (-b) sides:

  $ slopBed -b 50 -i LFY_summit_filtered.bed -g genome.txt > LFY_summit_filtered_extended.bed
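
To make the effect of the extension concrete, the sketch below imitates
it with awk: each interval grows by N bases on both sides and is clamped
to the range from 0 to the chromosome length given in the genome file.
slop_both is a hypothetical helper and the sample data are made up; it is
an approximation for illustration, not a replacement for slopBed.

```shell
# slop_both GENOME BED N: widen each BED interval by N bases on both sides,
# clamping to 0 and to the chromosome length listed in GENOME
# (two tab-separated columns: name, size).
slop_both() {
  awk -v n="$3" 'BEGIN { FS = OFS = "\t" }
    NR == FNR { size[$1] = $2; next }   # first file: chromosome sizes
    !($1 in size) { next }              # skip chromosomes missing from GENOME
    {
      lo = $2 - n; if (lo < 0) lo = 0
      hi = $3 + n; if (hi > size[$1]) hi = size[$1]
      $2 = lo; $3 = hi
      print
    }' "$1" "$2"
}

# Made-up example: one 1000-base chromosome and two summits near its ends.
tmp=$(mktemp -d)
printf 'chr1\t1000\n' > "$tmp/genome.txt"
printf 'chr1\t20\t21\tsummit_a\nchr1\t980\t981\tsummit_b\n' > "$tmp/summits.bed"
slop_both "$tmp/genome.txt" "$tmp/summits.bed" 50
# prints (tab-separated):
#   chr1 0 71 summit_a
#   chr1 930 1000 summit_b
rm -r "$tmp"
```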

slopBed is a program from the BEDTools suite. See the BEDTools documentation for more information.

Sort and index the summit file for distribution:

  $ sort -k1,1 -k2,2n LFY_summit_filtered_extended.bed | bgzip > LFY_summit_filtered_extended.bed.gz
  $ tabix -p bed LFY_summit_filtered_extended.bed.gz
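
tabix expects its input sorted by sequence name and then by start
position, which is what sorting with the keys -k1,1 -k2,2n provides. A
minimal sketch on made-up BED lines shows why the second key needs the
numeric n flag. sort_bed is a hypothetical helper, and setting LC_ALL=C is
an added assumption to make the name ordering reproducible.

```shell
# sort_bed: sort BED lines from stdin by chromosome name, then numeric start.
# LC_ALL=C makes the name ordering byte-wise and independent of the locale.
sort_bed() {
  LC_ALL=C sort -k1,1 -k2,2n
}

# Made-up intervals, deliberately out of order.
printf 'chr2\t100\t201\nchr1\t2000\t2101\nchr1\t300\t401\n' | sort_bed
# prints chr1 300 401, then chr1 2000 2101, then chr2 100 201 (tab-separated);
# without the n flag, 2000 would sort before 300 as text.
```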

Use PeakRanger to make coverage graph files from the bam format files:

  $ peakranger wigpe -d LFY.bam -o LFY

This makes LFY.wig

Sort, compress and index, after removing track and comment lines:

  $ grep -v '#' LFY.wig | grep -v 'track' | sort -k1,1 -k2,2n | bgzip > LFY.bedgraph.gz
  $ tabix -p bed LFY.bedgraph.gz

Repeat for the control. 
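
Repeating the coverage steps for the control can be sketched as a loop
over both merged samples. This is a dry run: print_coverage_commands is a
hypothetical helper that prints the commands from this protocol for each
sample rather than running peakranger, bgzip or tabix.

```shell
# print_coverage_commands SAMPLE: print (do not execute) the coverage-graph
# commands from this protocol for one merged sample, e.g. LFY or CNTRL.
# The function name is an illustrative assumption.
print_coverage_commands() {
  s=$1
  printf '%s\n' \
    "peakranger wigpe -d $s.bam -o $s" \
    "grep -v '#' $s.wig | grep -v 'track' | sort -k1,1 -k2,2n | bgzip > $s.bedgraph.gz" \
    "tabix -p bed $s.bedgraph.gz"
}

# Dry run for the IP sample and the control.
for name in LFY CNTRL; do
  print_coverage_commands "$name"
done
```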

That's all!

</pre>