I am a beginner in Python and would really appreciate if someone could help me with the following: I would like to run this script 10 times and for that change for every run the sub-batch (from 0-9):
E.g. the first run would be:
python $GWAS_TOOLS/gwas_summary_imputation.py \
-by_region_file $DATA/eur_ld.bed.gz \
-gwas_file $OUTPUT/harmonized_gwas/CARDIoGRAM_C4D_CAD_ADDITIVE.txt.gz \
-parquet_genotype $DATA/reference_panel_1000G/chr1.variants.parquet \
-parquet_genotype_metadata $DATA/reference_panel_1000G/variant_metadata.parquet \
-window 100000 \
-parsimony 7 \
-chromosome 1 \
-regularization 0.1 \
-frequency_filter 0.01 \
-sub_batches 10 \
-sub_batch 0 \
--standardise_dosages \
-output $OUTPUT/summary_imputation_1000G/CARDIoGRAM_C4D_CAD_ADDITIVE_chr1_sb0_reg0.1_ff0.01_by_region.txt.gz
The second run would be
python $GWAS_TOOLS/gwas_summary_imputation.py \
-by_region_file $DATA/eur_ld.bed.gz \
-gwas_file $OUTPUT/harmonized_gwas/CARDIoGRAM_C4D_CAD_ADDITIVE.txt.gz \
-parquet_genotype $DATA/reference_panel_1000G/chr1.variants.parquet \
-parquet_genotype_metadata $DATA/reference_panel_1000G/variant_metadata.parquet \
-window 100000 \
-parsimony 7 \
-chromosome 1 \
-regularization 0.1 \
-frequency_filter 0.01 \
-sub_batches 10 \
-sub_batch 1 \
--standardise_dosages \
-output $OUTPUT/summary_imputation_1000G/CARDIoGRAM_C4D_CAD_ADDITIVE_chr1_sb0_reg0.1_ff0.01_by_region.txt.gz
I am sure this can be done with a loop but not quite sure how to do it in python?
Thank you so much for any advice, Sally
CodePudding user response:
While we can't show you how to retrofit a loop to the python code without actually seeing the python code, you could just use a shell loop to accomplish what you want without touching the python code. For bash shell, it would look like this:
for sub_batch in {0..9}; do \
python $GWAS_TOOLS/gwas_summary_imputation.py \
-by_region_file $DATA/eur_ld.bed.gz \
-gwas_file $OUTPUT/harmonized_gwas/CARDIoGRAM_C4D_CAD_ADDITIVE.txt.gz \
-parquet_genotype $DATA/reference_panel_1000G/chr1.variants.parquet \
-parquet_genotype_metadata
$DATA/reference_panel_1000G/variant_metadata.parquet \
-window 100000 \
-parsimony 7 \
-chromosome 1 \
-regularization 0.1 \
-frequency_filter 0.01 \
-sub_batches 10 \
-sub_batch $sub_batch \
--standardise_dosages \
-output $OUTPUT/summary_imputation_1000G/CARDIoGRAM_C4D_CAD_ADDITIVE_chr1_sb0_reg0.1_ff0.01_by_region.txt.gz
done
CodePudding user response:
So all what I need to do is, is running this:
> for i in range(0, 10):
> python $GWAS_TOOLS/gwas_summary_imputation.py
> -by_region_file $DATA/eur_ld.bed.gz \
> -gwas_file $OUTPUT/harmonized_gwas/CARDIoGRAM_C4D_CAD_ADDITIVE.txt.gz \
> -parquet_genotype $DATA/reference_panel_1000G/chr1.variants.parquet \
> -parquet_genotype_metadata $DATA/reference_panel_1000G/variant_metadata.parquet \
> -window 100000 \
> -parsimony 7 \
> -chromosome 1 \
> -regularization 0.1 \
> -frequency_filter 0.01 \
> -sub_batches 10 \
> -sub_batch i \
> --standardise_dosages \
> -output $OUTPUT/summary_imputation_1000G/CARDIoGRAM_C4D_CAD_ADDITIVE_chr1_sb0_reg0.1_ff0.01_by_region.txt.gz
No "done" or something in the end?
CodePudding user response:
What you're looking for seems to be a way to run the same command on the command line a set number of times.
If you're on Linux using the bash shell, this can be done using a shell loop:
for i in {0..9}; do
python $GWAS_TOOLS/gwas_summary_imputation.py \
-by_region_file $DATA/eur_ld.bed.gz \
-gwas_file $OUTPUT/harmonized_gwas/CARDIoGRAM_C4D_CAD_ADDITIVE.txt.gz \
-parquet_genotype $DATA/reference_panel_1000G/chr1.variants.parquet \
-parquet_genotype_metadata $DATA/reference_panel_1000G/variant_metadata.parquet \
-window 100000 \
-parsimony 7 \
-chromosome 1 \
-regularization 0.1 \
-frequency_filter 0.01 \
-sub_batches 10 \
-sub_batch $i \
--standardise_dosages \
-output $OUTPUT/summary_imputation_1000G/CARDIoGRAM_C4D_CAD_ADDITIVE_chr1_sb0_reg0.1_ff0.01_by_region.txt.gz
done
If you're on Windows, similar can be achieved using powershell:
for ($i=0; $i -le 9; $i ) {
python $GWAS_TOOLS/gwas_summary_imputation.py \
-by_region_file $DATA/eur_ld.bed.gz \
-gwas_file $OUTPUT/harmonized_gwas/CARDIoGRAM_C4D_CAD_ADDITIVE.txt.gz \
-parquet_genotype $DATA/reference_panel_1000G/chr1.variants.parquet \
-parquet_genotype_metadata $DATA/reference_panel_1000G/variant_metadata.parquet \
-window 100000 \
-parsimony 7 \
-chromosome 1 \
-regularization 0.1 \
-frequency_filter 0.01 \
-sub_batches 10 \
-sub_batch $i \
--standardise_dosages \
-output $OUTPUT/summary_imputation_1000G/CARDIoGRAM_C4D_CAD_ADDITIVE_chr1_sb0_reg0.1_ff0.01_by_region.txt.gz
}
CodePudding user response:
a loop in python from 0 to 10 is very easy.
for i in range(0, 10):
do stuff