mv command isn't moving files in Snakemake rule

Time:05-05

I am running a new rule whose first shell command calls tin.py. The script writes its output files to the current directory, which is fine. I want to move those files into a new folder called rnaseqc as they are created, or just after. However, only some of them get moved; the rest fail, and the error message shows set -euo pipefail. What am I doing wrong, and how could this be done better?

rule tin:
    """
    Quality-control step to infer RNA integrity at the transcript level.
    TINs (transcript integrity numbers) are calculated for all canonical
    protein-coding transcripts. TIN is analogous to a computationally derived
    RIN value. From the docs: requires a sorted and indexed BAM file.
    @Input:
        Sorted, duplicate marked genomic BAM file (scatter)
    @Output:
        RSeQC logfiles containing transcript integrity number information
    """
    input:
        bam = rules.picard_dupes.output.bam,
        bai = rules.picard_dupes.output.index,
        control = rules.rseqc.output.Rdist
    output:
        out1 = 'rnaseqc/{sampleID}_marked_duplicates.tin.xls',
        out2 = 'rnaseqc/{sampleID}_marked_duplicates.summary.txt'
    params:
        ## Can I make more user friendly?
        bedref = 'PATH/transcripts.protein_coding_only.bed12',
        outdir = 'rnaseqc'
    run:
        ## Can I make more user friendly?
        shell('python PATH/.local/bin/tin.py -i {input.bam} -r {params.bedref}')
        # moves to rnaseqc directory
        shell('mv *.{txt,xls} {params.outdir}')
        # tried this also and getting index out of range error
        #shell('find . -maxdepth 1 -name "*txt" -or -name "*xls" -exec mv {} {params.outdir}')

Answer:

My guess is that the problem comes from several jobs of this rule running at the same time. While one job is still in the middle of processing sample1, the job for sample2 runs its mv and moves every matching file in the directory, including sample1's. You can change the move command so that it only targets the sample in question:

shell('mv {wildcards.sampleID}_marked_duplicates.* {params.outdir}')
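
To illustrate why the broad pattern is risky when jobs share a working directory, here is a small sketch in plain Python (not Snakemake itself). It assumes two samples whose tin.py-style output files sit in the same directory; the helper names make_workdir, move_matching and files_left are made up for this example.

```python
import glob
import os
import shutil
import tempfile

SUFFIXES = ("tin.xls", "summary.txt")


def make_workdir(samples):
    """Create a scratch directory holding tin.py-style outputs for each sample."""
    workdir = tempfile.mkdtemp()
    for sample in samples:
        for suffix in SUFFIXES:
            name = f"{sample}_marked_duplicates.{suffix}"
            with open(os.path.join(workdir, name), "w") as handle:
                handle.write("placeholder\n")
    return workdir


def move_matching(workdir, patterns, outdir_name="rnaseqc"):
    """Move files in workdir matching any pattern into workdir/outdir_name."""
    outdir = os.path.join(workdir, outdir_name)
    os.makedirs(outdir, exist_ok=True)
    moved = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(glob.escape(workdir), pattern)):
            shutil.move(path, os.path.join(outdir, os.path.basename(path)))
            moved.append(os.path.basename(path))
    return sorted(moved)


def files_left(workdir):
    """List regular files still sitting in the top level of workdir."""
    return sorted(
        name for name in os.listdir(workdir)
        if os.path.isfile(os.path.join(workdir, name))
    )


samples = ["sample1", "sample2"]

# Broad patterns, like `mv *.txt *.xls`: sample1's job takes sample2's files too.
broad_dir = make_workdir(samples)
broad_moved = move_matching(broad_dir, ["*.txt", "*.xls"])
broad_left = files_left(broad_dir)

# Sample-specific pattern: only sample1's files move; sample2's stay put.
targeted_dir = make_workdir(samples)
targeted_moved = move_matching(targeted_dir, ["sample1_marked_duplicates.*"])
targeted_left = files_left(targeted_dir)

print("broad pattern moved:", broad_moved)
print("targeted pattern moved:", targeted_moved)
print("left for sample2's job:", targeted_left)

shutil.rmtree(broad_dir)
shutil.rmtree(targeted_dir)
```

With the broad patterns nothing is left for the other job to move, so its own mv would find no files and exit with an error. That would be consistent with only some jobs succeeding, though it is still a guess about what happens in the real pipeline.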

More generally, it is preferable to use the shell directive rather than calling shell() inside a run block.
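
As a rough sketch, the rule from the question could look like this with a shell directive. It is untested and assumes that tin.py writes {sampleID}_marked_duplicates.tin.xls and {sampleID}_marked_duplicates.summary.txt into the working directory, which is what the output names in the question suggest. PATH is the same placeholder as in the question.

```snakemake
rule tin:
    input:
        bam = rules.picard_dupes.output.bam,
        bai = rules.picard_dupes.output.index,
        control = rules.rseqc.output.Rdist
    output:
        out1 = 'rnaseqc/{sampleID}_marked_duplicates.tin.xls',
        out2 = 'rnaseqc/{sampleID}_marked_duplicates.summary.txt'
    params:
        bedref = 'PATH/transcripts.protein_coding_only.bed12'
    shell:
        # Assumption: tin.py names its outputs after the BAM file and
        # writes them to the current working directory.
        """
        python PATH/.local/bin/tin.py -i {input.bam} -r {params.bedref}
        mv {wildcards.sampleID}_marked_duplicates.tin.xls {output.out1}
        mv {wildcards.sampleID}_marked_duplicates.summary.txt {output.out2}
        """
```

Naming each file explicitly means a job only touches its own sample, and it avoids literal braces in the command. Snakemake formats the shell string before running it, so braces that are not placeholders have to be doubled, for example {{}} for find -exec or {{txt,xls}} for a brace expansion. The unescaped {} may be what caused the index out of range error with the find command.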
