I have some files in a directory: SRR01231_1.fastq SRR01231_2.fastq SRR01232_1.fastq SRR01232_2.fastq SRR01233_1.fastq SRR01233_2.fastq
I am writing a Snakemake workflow to run some analysis on these files, and for that I need the names of the files in this directory. I am trying to get them with the glob function, but I can't work out how to use it properly.
Sample code I wrote:
import glob
srr, fr = glob.glob({id} '_' {int} 'fastq')
The output I am expecting is for the id (e.g., SRR01231) to be saved to srr and the integer that follows it to be saved to fr.
Is it possible to use some other function to do the same?
Any suggestions or help is appreciated.
CodePudding user response:
You can use pathlib.Path and its glob method to parse such info:
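import pathlib

# Match every file ending in .fastq in the given directory
fastq_paths = pathlib.Path("/path/to/your/fastq-files").glob("*.fastq")
for path in fastq_paths:
    # path.stem is the file name without the extension, e.g. "SRR01231_1"
    srr, fr = path.stem.split("_")
    print(srr, fr)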
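Note that splitting on "_" raises a ValueError as soon as one .fastq file in the directory does not have exactly one underscore in its name. If that can happen, here is a minimal sketch of a stricter variant that uses a regular expression and skips anything that doesn't fit. The function name parse_fastq_names and the pattern are my own assumptions about your naming scheme, not something glob or Snakemake provides:

```python
import pathlib
import re

# Assumed naming scheme: <run id>_<read number>.fastq, e.g. SRR01231_1.fastq
FASTQ_NAME = re.compile(r"(?P<srr>[^_]+)_(?P<fr>\d+)\.fastq")


def parse_fastq_names(directory):
    """Return sorted (srr, fr) pairs for the matching files in `directory`."""
    pairs = []
    for path in pathlib.Path(directory).glob("*.fastq"):
        match = FASTQ_NAME.fullmatch(path.name)
        if match:  # skip files that do not fit the assumed scheme
            pairs.append((match["srr"], match["fr"]))
    return sorted(pairs)
```

Calling parse_fastq_names("/path/to/your/fastq-files") should give you a list of (srr, fr) string tuples that you can unpack or feed into the rest of the workflow. Since you are in a Snakefile anyway, it may also be worth looking at Snakemake's own glob_wildcards helper, which is meant for this kind of pattern matching.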