Output file not written in specific path using awk into a nextflow process

Time:05-04

I would like to choose where the outputs produced by awk inside a Nextflow process are written, but I cannot get it to work. If I use $it in the view closure, I get the output in the work directory, but I need to choose the output path myself. I tried changing $it to "${PathOutdir}/test_out.csv", but that doesn't work. Below is a simple awk command inside a Nextflow process. Should I use the workflow function instead? Thanks in advance!

PathFile = "/home/pvellosillo/nextflow_test/test.csv"
InputCsv = file(PathFile)
PathOutdir = "/home/pvellosillo/nextflow_test"

process genesFilter {
tag "PathInputFile:${PathFile}"
input:
   path InputCsv
output:
  file("test_out.csv") into out_filter

shell:

"""
#!/bin/bash
awk 'BEGIN{FS=OFS="\t"}{print \$2}' $InputCsv > "test_out.csv"
"""
}

out_filter.view {"${PathOutdir}/test_out.csv"}

CodePudding user response:

From your question and comment above:

Note that by using publishDir {$PathOutdir} I get an output in the chosen directory, but the files are symbolic links to the work directory instead of regular files

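I think you want the 'copy' mode of publishDir, which copies the declared output files into the publish directory instead of creating symbolic links to the work directory (the default mode is 'symlink'). Make sure downstream processes do not read output files from the publishDir. From the documentation: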

Files are copied into the specified directory in an asynchronous manner, so they may not be immediately available in the published directory at the end of the process execution. For this reason, downstream processes should not try to access output files through the publish directory, but through channels.

params.pathFile = "/home/pvellosillo/nextflow_test/test.csv"
params.publishDir = "/home/pvellosillo/nextflow_test"

InputCsv = file( params.pathFile )


process genesFilter {

    tag { InputCsv.name }

    publishDir(
        path: "${params.publishDir}/genesFilter",
        mode: 'copy',
    )

    input:
    path InputCsv

    output:
    path "test_out.csv" into out_filter

    shell:
    '''
    awk 'BEGIN { FS=OFS="\\t" } { print $2 }' "!{InputCsv}" > "test_out.csv"
    '''
}

out_filter.view()

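Also note that shell block definitions require single-quote (') delimited strings. Inside a shell block, Nextflow variables are written as !{...}, while $ is left untouched for Bash and awk, so $2 does not need to be escaped.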
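To check the awk step on its own, outside Nextflow, something like the following sketch can help. It shows the command roughly as Bash receives it once Nextflow has replaced !{InputCsv} with the staged file name. The sample rows are made up for illustration and are not from the question; the file names simply mirror the ones used above.

```shell
# Hypothetical sample data: three tab-separated columns (not from the question).
printf 'id\tgene\tscore\n1\tBRCA1\t0.9\n2\tTP53\t0.7\n' > test.csv

# The awk program from the process, as Bash sees it after Nextflow has
# substituted !{InputCsv}. $2 is an awk field reference, not a shell variable,
# because the awk program is wrapped in single quotes.
awk 'BEGIN { FS=OFS="\t" } { print $2 }' "test.csv" > "test_out.csv"

# Show the second column of every row.
cat test_out.csv
```

With this sample input, test_out.csv contains the three lines gene, BRCA1 and TP53. If the command behaves as expected here but not in the pipeline, the quoting of the process block is the first thing to check.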
