How to get the names of the files using the glob function in Python?

Time:09-23

I have some files in a directory: SRR01231_1.fastq, SRR01231_2.fastq, SRR01232_1.fastq, SRR01232_2.fastq, SRR01233_1.fastq, SRR01233_2.fastq

I am writing a Snakemake workflow to run some analysis on these files. For that I need the names of the files in this directory. I am trying to get them with the glob function, but I have not been able to use it properly.

Sample code I wrote:

import glob
srr, fr = glob.glob({id} '_' {int} 'fastq')

The output I expect is the id (e.g., SRR01231) saved to srr and the integer that follows it saved to fr.

Is it possible to use some other function to do the same?

Any suggestions or help is appreciated.

CodePudding user response:

You can use pathlib.Path and its glob method to find the files, then split each file name's stem on the underscore to get the two parts:

import pathlib

fastq_paths = pathlib.Path("/path/to/your/fastq-files").glob("*.fastq")

for path in fastq_paths:
    srr, fr = path.stem.split("_")
    print(srr, fr)
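
If you would rather stay with the glob module from the question, the same idea can be sketched with glob plus a regular expression. This is only a sketch under assumptions: the function name parse_fastq_names and the FASTQ_NAME pattern are made up for illustration, and the pattern assumes every file is named "<id>_<number>.fastq" as in the listing above. Files that do not fit that scheme are skipped rather than raising an error.

```python
import glob
import os
import re

# Assumed naming scheme: "<id>_<number>.fastq", e.g. SRR01231_1.fastq
FASTQ_NAME = re.compile(r"^(?P<srr>.+)_(?P<fr>\d+)\.fastq$")


def parse_fastq_names(directory):
    """Return sorted (srr, fr) pairs for the matching FASTQ files in directory."""
    pairs = []
    # glob.escape keeps special characters in the directory name literal
    pattern = os.path.join(glob.escape(directory), "*.fastq")
    for path in glob.glob(pattern):
        match = FASTQ_NAME.match(os.path.basename(path))
        if match:  # skip files that do not follow the naming scheme
            pairs.append((match.group("srr"), match.group("fr")))
    return sorted(pairs)
```

Calling parse_fastq_names("/path/to/your/fastq-files") gives a list of (srr, fr) tuples, with fr kept as a string. Inside a Snakefile, Snakemake's own glob_wildcards helper is aimed at the same need and may be worth a look, for example glob_wildcards("{srr}_{fr}.fastq"), though I have not tested it against this directory.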