I hope you can help me solve my issue, or tell me whether I should submit a bug report.
I'm 'importing' a snakemake module from github in another snakefile, which is local. This appears to mess up the target of the local snakefile. When the 2nd snakefile is imported the target is no longer the one specified by rule 'all', but by some arbitrary (?) rule in the imported snakefile, even when the imported snakefile does not contain any relevant rules.
I've put together an example set of two repos on GitHub which suffer from this problem (lpagie/repo1 and lpagie/repo2). From the repo1/readme.md:
==============
This repo is set up to illustrate a problem (?) with using snakemake modules from github
Clone this repo locally and run snakemake from a directory above the cloned
repo, using the wrapper run.sh
This snakefile will 'import' lpagie/repo2, which in its current form only contains
commented-out rules and a rule which is (supposedly) not meaningful for repo1.
Running the snakemake of repo1 will not generate the output specified by rule
'all' (output/final) but instead the output generated by rule 'nonsense' ....
When the import of the repo2 module is commented out in repo1/snakefile_1.smk, running snakemake generates the expected outcome.
=============
Am I overlooking something obvious?
I'm using Snakemake v6.9.1, installed via conda.
Here's the output I get after freshly cloning both repos and running 'repo1/run.sh':
git clone git@github.com:lpagie/repo1.git
git clone git@github.com:lpagie/repo2.git
bash repo1/run.sh
repo_dir = /data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos/repo1
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job             count    min threads    max threads
------------  -------  -------------  -------------
wf2_nonsense        1              1              1
total               1              1              1
Select jobs to execute...
[Wed Oct 6 17:04:44 2021]
rule wf2_nonsense:
input: /tmp/tmph_6w4l9asnakemake-runtime-source-cache/bfcfa05f3052febb0b88b59991e4aac562b3465cfdb8f8d288a357884ae7572b
output: output/nonsense.out
jobid: 0
reason: Missing output files: output/nonsense.out
resources: tmpdir=/tmp
/data/home/ludo/miniconda3/bin/python3.8 /data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos/.snakemake/scripts/tmpu8huybi8.touch.py
repo2
/data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos
[Wed Oct 6 17:04:45 2021]
Finished job 0.
1 of 1 steps (100%) done
Complete log: /data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos/.snakemake/log/2021-10-06T170442.797027.snakemake.log
The same run after commenting out the lines that import the repo2 module:
vi repo1/snakefile_1.smk
bash repo1/run.sh
repo_dir = /data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos/repo1
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job      count    min threads    max threads
-----  -------  -------------  -------------
A            1              1              1
B            1              1              1
all          1              1              1
total        3              1              1
Select jobs to execute...
[Wed Oct 6 17:08:18 2021]
rule B:
output: output/fB
jobid: 2
reason: Missing output files: output/fB
resources: tmpdir=/tmp
bash /data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos/repo1/scripts/touch.sh output/fB
[Wed Oct 6 17:08:18 2021]
Finished job 2.
1 of 3 steps (33%) done
Select jobs to execute...
[Wed Oct 6 17:08:18 2021]
rule A:
input: output/fB
output: output/final
jobid: 1
reason: Missing output files: output/final; Input files updated by another job: output/fB
resources: tmpdir=/tmp
bash /data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos/repo1/scripts/touch.sh output/final
[Wed Oct 6 17:08:18 2021]
Finished job 1.
2 of 3 steps (67%) done
Select jobs to execute...
[Wed Oct 6 17:08:18 2021]
localrule all:
input: output/final
jobid: 0
reason: Input files updated by another job: output/final
resources: tmpdir=/tmp
[Wed Oct 6 17:08:18 2021]
Finished job 0.
3 of 3 steps (100%) done
Complete log: /data/home/ludo/projects/20211005_test_snakemake_submodules/test_repos/.snakemake/log/2021-10-06T170818.178572.snakemake.log
I also created lpagie/repo3, which is a copy of repo1 with the lines that import the repo2 module commented out.
Answer:
Your code to import rules from the remote module comes before rule all. When no target is given on the command line, Snakemake uses the first rule it encounters as the default target, so whichever rule is imported first determines the final output of the pipeline.
So just put the import after rule all. Instead of this:
module other_workflow:
    snakefile: github("lpagie/repo2", path="snakefile_2.smk", commit="61f60f7")
    config: config

use rule * from other_workflow as wf2_*

rule all:
    input:
        "output/final"
Try:
module other_workflow:
    snakefile: github("lpagie/repo2", path="snakefile_2.smk", commit="61f60f7")
    config: config

rule all:
    input:
        "output/final"

use rule * from other_workflow as wf2_*
(As an aside, it appears the github function was a recent addition, and this will work with Snakemake >= 6.9.)
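
To make the ordering effect concrete, here is a toy model in plain Python. It is not Snakemake's actual code, only a sketch of the "first rule defined becomes the default target" behaviour. The function name `default_target` and the lists are made up for illustration. The rule names come from the logs in the question, and the relative order of A and B in repo1 is an assumption.

```python
# Toy model only: NOT Snakemake's implementation, just an illustration of
# "the first rule defined becomes the default target".

def default_target(rules_in_definition_order):
    """Return the rule used as target when none is requested explicitly."""
    if not rules_in_definition_order:
        raise ValueError("no rules defined")
    return rules_in_definition_order[0]


# Rule names as they appear in the logs; the imported rule carries the
# wf2_ prefix from `use rule * from other_workflow as wf2_*`.
imported_rules = ["wf2_nonsense"]

# `use rule *` placed BEFORE `rule all` (the situation in the question).
# The order of A and B is assumed here; it does not affect the result.
import_first = imported_rules + ["all", "A", "B"]

# `use rule *` placed AFTER `rule all` (the suggested fix).
all_first = ["all"] + imported_rules + ["A", "B"]

print(default_target(import_first))  # wf2_nonsense
print(default_target(all_first))     # all
```

Independently of rule order, a target can also be requested explicitly by passing it to snakemake on the command line (for example the rule name all), which bypasses the default-target choice.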