Cluster Flow Examples

You can see some static examples of what to expect from Cluster Flow below.
If you like, you can try these out in the online demo.

Finding out what's available with Cluster Flow

Cluster Flow has a number of functions to list installed pipelines, modules and reference genomes:

cf --pipelines 
================================
Cluster Flow - available pipelines
================================
Installed pipelines:
    Directory ./
    Directory /Users/demouser/.clusterflow/pipelines/ (not found)
    Directory /Users/demouser/Work/Cluster_Flow/clusterflow/pipelines/
	- bam_preseq
	- bismark
	- bismark_RRBS
	- bismark_pbat
	- bismark_singlecell
	- bwa_preseq
	- chipseq_qc
	- fastq_bismark
	- fastq_bismark_RRBS
	- fastq_bowtie
	- fastq_hicup
	- fastq_hisat2
	- fastq_pbat
	- fastq_star
	- fastq_tophat
	- sra_bismark
	- sra_bismark_RRBS
	- sra_bowtie
	- sra_bowtie1
	- sra_bowtie2
	- sra_bowtie_miRNA
	- sra_hicup
	- sra_hisat2
	- sra_pbat
	- sra_tophat
	- sra_trim
	- trim_bowtie_miRNA
	- trim_tophat

cf --modules
================================
Cluster Flow - available modules
================================
Available modules:
    Directory ./
    Directory /Users/demouser/.clusterflow/modules/ (not found)
    Directory /Users/demouser/Work/Cluster_Flow/clusterflow/modules/
	- bedToNrf
	- bedtools_bamToBed
	- bedtools_intersectNeg
	- bismark_align
	- bismark_deduplicate
	- bismark_methXtract
	- bismark_report
	- bismark_summary_report
	- bowtie
	- bowtie1
	- bowtie2
	- bwa
	- cf_download
	- cf_merge_files
	- cf_run_finished
	- cf_runs_all_finished
	- deeptools_bamCoverage
	- deeptools_bamFingerprint
	- fastq_screen
	- fastqc
	- featureCounts
	- hicup
	- hisat2
	- htseq_counts
	- kallisto
	- multiqc
	- phantompeaktools_runSpp
	- picard_dedup
	- preseq_calc
	- rseqc_geneBody_coverage
	- rseqc_inner_distance
	- rseqc_junctions
	- rseqc_read_GC
	- samtools_bam2sam
	- samtools_dedup
	- samtools_sort_index
	- sra_abidump
	- sra_fqdump
	- star
	- tophat
	- tophat_broken_MAPQ
	- trim_galore

cf --genomes
================================
Cluster Flow - available genomes
================================

-------------------------------------------------------------------------------------
 /home/demouser/.clusterflow/genomes.config
 Name    Type     Species       Assembly  Path
-------------------------------------------------------------------------------------
 GRCh37  bed12    Homo_sapiens  GRCh37    /genomes/Homo_sapiens/Genes/genes.bed12
 GRCh37  bismark  Homo_sapiens  GRCh37    /genomes/Homo_sapiens/BismarkIndex
 GRCh37  bowtie   Homo_sapiens  GRCh37    /genomes/Homo_sapiens/BowtieIndex/genome
 GRCh37  bowtie2  Homo_sapiens  GRCh37    /genomes/Homo_sapiens/Bowtie2Index/genome
 GRCh37  bwa      Homo_sapiens  GRCh37    /genomes/Homo_sapiens/BWAIndex/genome.fa
 GRCh37  fasta    Homo_sapiens  GRCh37    /genomes/Homo_sapiens/WholeGenomeFasta
 GRCh37  gtf      Homo_sapiens  GRCh37    /genomes/Homo_sapiens/Genes/genes.gtf
 GRCh37  star     Homo_sapiens  GRCh37    /genomes/Homo_sapiens/STARIndex/
-------------------------------------------------------------------------------------
 GRCm38  bed12    Mus_musculus  GRCm38    /genomes/Mus_musculus/Genes/genes.bed12
 GRCm38  bismark  Mus_musculus  GRCm38    /genomes/Mus_musculus/BismarkIndex
 GRCm38  bowtie   Mus_musculus  GRCm38    /genomes/Mus_musculus/BowtieIndex/genome
 GRCm38  bowtie2  Mus_musculus  GRCm38    /genomes/Mus_musculus/Bowtie2Index/genome
 GRCm38  bwa      Mus_musculus  GRCm38    /genomes/Mus_musculus/BWAIndex/genome.fa
 GRCm38  fasta    Mus_musculus  GRCm38    /genomes/Mus_musculus/WholeGenomeFasta
 GRCm38  gtf      Mus_musculus  GRCm38    /genomes/Mus_musculus/Genes/genes.gtf
-------------------------------------------------------------------------------------
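
Because these listings are plain text, they can be filtered with standard shell tools. The following is a minimal sketch, not part of Cluster Flow itself: the `genome_path` helper is a hypothetical name, and the rows are copied from the sample `cf --genomes` output above so that the snippet runs without Cluster Flow installed. It assumes the column order shown in that table (Name, Type, Species, Assembly, Path).

```shell
# Rows copied from the sample `cf --genomes` output above.
# Assumption: real output keeps the same column order.
genomes_table=$(cat <<'EOF'
 GRCh37  bowtie2  Homo_sapiens  GRCh37    /genomes/Homo_sapiens/Bowtie2Index/genome
 GRCh37  star     Homo_sapiens  GRCh37    /genomes/Homo_sapiens/STARIndex/
 GRCm38  bowtie2  Mus_musculus  GRCm38    /genomes/Mus_musculus/Bowtie2Index/genome
EOF
)

# genome_path NAME TYPE (hypothetical helper):
# print the Path column of the row matching both the name and the type.
genome_path() {
    printf '%s\n' "$genomes_table" |
        awk -v name="$1" -v type="$2" '$1 == name && $2 == type { print $5 }'
}

genome_path GRCh37 bowtie2
```

With a working installation, the same awk filter could presumably read the output of `cf --genomes` directly instead of the saved rows.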

Running pipelines

You can run modules and pipelines with as few or as many extra parameters as you need. The commands below run from the simplest form to the most flexible:

cf sra_trim *.sra
cf samtools_sort_index *.bam
cf --genome GRCh37 fastq_tophat *fastq.gz
cf --genome NCBIM37 fastq_bismark ftp://fileserver.edu/data/sample1.fq
cf --paired --project a2015930 --genome s.pombe --file_list downloads.txt fastq_tophat
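
The last command above reads its inputs from `downloads.txt` rather than from the command line. As a rough sketch, such a file could be prepared as shown below. This assumes that `--file_list` expects one path or URL per line, which is not stated on this page. The second URL is a hypothetical placeholder, and the final command is only printed, not run, so the snippet works without Cluster Flow installed.

```shell
# Write a file list with one input per line.
# Assumption: --file_list reads one path or URL per line.
# sample2.fq is a hypothetical placeholder; use your own files.
cat > downloads.txt <<'EOF'
ftp://fileserver.edu/data/sample1.fq
ftp://fileserver.edu/data/sample2.fq
EOF

# Show how many inputs were listed.
echo "Inputs listed: $(wc -l < downloads.txt | tr -d ' ')"

# Print the command that would use the list (not executed here).
echo "cf --genome NCBIM37 --file_list downloads.txt fastq_bismark"
```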

Monitoring progress

Once running, Cluster Flow has a number of tools to help you keep track of your jobs:

cf --qstat 
======================================================================
 Cluster Flow Pipeline: fastq_bismark
 Submitted:             14 seconds ago
 Working Directory:     /home/clusterflow/public_html/demo/demofiles
 Cluster Flow ID:       fastq_bismark_1432818534
======================================================================

 -  bismark_align                      [8 cores]
      - bismark_deduplicate
           - bismark_methXtract
               - bismark_report

 -  trim_galore                        [3 cores]
      - bismark_align
           - bismark_deduplicate
                - bismark_methXtract
                     - bismark_report

 -  trim_galore                        [3 cores]
      - bismark_align
           - bismark_deduplicate
                - bismark_methXtract
                     - bismark_report
                          - email_run_complete
                               - bismark_summary_report

 -  trim_galore                        [3 cores]  [queued, priority 100000]
      - bismark_align
           - bismark_deduplicate
                - bismark_methXtract
                     - bismark_report
Cluster Flow can also send notification e-mails when a pipeline completes.
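
The `cf --qstat` output is also plain text, so a quick summary can be pulled out with `grep`. This sketch works on the top-level job lines copied from the sample output above. It assumes that waiting jobs are always marked with `[queued`, as in that sample.

```shell
# Top-level job lines copied from the sample `cf --qstat` output above.
qstat_output=$(cat <<'EOF'
 -  bismark_align                      [8 cores]
 -  trim_galore                        [3 cores]
 -  trim_galore                        [3 cores]
 -  trim_galore                        [3 cores]  [queued, priority 100000]
EOF
)

# Count the lines carrying the "[queued" marker.
# Assumption: waiting jobs are always marked this way.
queued=$(printf '%s\n' "$qstat_output" | grep -c '\[queued')
echo "Queued jobs: $queued"
```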

Pipelines and Modules

To find out more about the specific pipelines and modules that come bundled with Cluster Flow, see the Modules and Pipelines pages.

Further Information

To find out more about Cluster Flow, have a look at the online demo, read the documentation, or download a copy and try it out!