My objective is simple: I'd like to keep the output of certain rules local-only and not upload it to our Amazon S3 bucket.
Inside the documentation, I see keep_local=True, which keeps remote files on the local drive after processing. However, this isn't what I'm looking for, as it doesn't prevent the rule from uploading its output to Amazon S3.
Snakemake currently acts like a mirror between Amazon S3 and my local drive.
For reference, this is how we've been setting up Amazon S3 with Snakemake:
# run command
snakemake --default-remote-provider S3 --default-remote-prefix '$s3' --use-conda --cores 32 --rerun-incomplete --printshellcmds
# inside the Snakefile
S3 = S3RemoteProvider(access_key_id=config["s3_params"]["access_key_id"], secret_access_key=config["s3_params"]["secret_access_key"])
# example of rule all
# runs all rules
rule all:
    input:
        expand(["{sample}.demultiplex_fastqc.zip",
                "{sample}.demultiplex_fastqc.html"],
               sample=samples["sample"]),
        expand(["{sample}.adapterTrim.round2.rmRep.metrics"],
               sample=samples["sample"])
        # etc...
Answer:
There are at least two options:
- continue using snakemake --default-remote-provider S3 --default-remote-prefix '$s3' and wrap the files that should be kept locally with local(some_file) (a fuller sketch follows after this list):

rule some_rule:
    output:
        local("my_file.txt")  # will not be uploaded to S3
- use snakemake without S3 as the default provider and explicitly wrap remote files with S3.remote (also sketched below):

from snakemake.remote.S3 import RemoteProvider as S3RemoteProvider

S3 = S3RemoteProvider()

rule some_rule:
    output:
        S3.remote("my_file.txt")  # will be uploaded to S3
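To illustrate the first option: with the default remote provider enabled, any unwrapped output path is treated as remote and uploaded, while a local() wrapper opts a file out. A minimal sketch, assuming the run command from the question; the rule name, tool, and file names here are hypothetical:

# run with: snakemake --default-remote-provider S3 --default-remote-prefix '$s3' ...
rule make_report:
    output:
        "results/{sample}.metrics.txt",        # no wrapper: treated as remote, uploaded to S3
        local("results/{sample}.report.html")  # local(): kept on the local drive, never uploaded
    shell:
        "my_tool --sample {wildcards.sample} "
        "--metrics {output[0]} --report {output[1]}"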
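For the second option, the provider can be instantiated with explicit credentials exactly as in the question's setup, and the run command drops the --default-remote-* flags; only files wrapped in S3.remote touch the bucket, while plain paths stay local. A sketch, with hypothetical bucket, rule, tool, and file names:

from snakemake.remote.S3 import RemoteProvider as S3RemoteProvider

# credentials passed explicitly, as in the question's setup
S3 = S3RemoteProvider(access_key_id=config["s3_params"]["access_key_id"],
                      secret_access_key=config["s3_params"]["secret_access_key"])

rule summarize:
    output:
        S3.remote("my-bucket/results/{sample}.summary.txt"),  # explicitly remote: uploaded to S3
        "logs/{sample}.summary.log"                           # plain path: stays local
    shell:
        "summarize_tool --out {output[0]} --log {output[1]}"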