I am trying to group the .xls files in a list, infiles, based on strings in the file names. The file names are formatted like "type_d_cross_profile_glacier_name_A-Z", where type_d is a type of glacier environment, glacier_name is the name of the glacier, and A-Z is the letter identifying the cross profile (there are multiple cross profiles per glacier in each type, and not always 26 of them).
I would like to group the files first by type (type_a to type_d) and then by glacier name, so that the A-Z cross profiles for each glacier are all grouped together. I think I have to use groupby, but I can't work out how to use the key, group aspect with two different strings to group by.
I have used a longhand version to group the types:
type_a = [a for a in infiles if "type_a" in a]
type_b = [b for b in infiles if "type_b" in b]
type_c = [c for c in infiles if "type_c" in c]
type_d = [d for d in infiles if "type_d" in d]
This has worked fine, but I am sure there is a more elegant way to group by type and then by glacier. P.S. I'm relatively new to Python and have ADHD, so I find multi-level things really difficult to comprehend; I really appreciate any help!
CodePudding user response:
Use a dict.
types = {}
for f in infiles:
    # Key on the first two underscore-separated parts, e.g. "type_a"
    prefix = '_'.join(f.split('_', 2)[:2])  # could also use regex
    types.setdefault(prefix, []).append(f)
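To also group by glacier within each type, the same idea can be nested one level deeper. The following is only a sketch under assumptions the question does not confirm: that "cross_profile" appears literally in every file name, that the profile letter is a single capital letter just before ".xls", and that the example names below (which are made up) resemble the real ones. The helper name group_files and the pattern NAME_RE are hypothetical.

```python
import os
import re
from collections import defaultdict

# Assumed file-name layout (not confirmed by the question):
#   type_<letter>_cross_profile_<glacier name>_<profile letter>.xls
NAME_RE = re.compile(r"^(type_[a-z])_cross_profile_(.+)_([A-Z])\.xls$")


def group_files(infiles):
    """Return {type: {glacier: [files sorted by name]}}."""
    groups = defaultdict(lambda: defaultdict(list))
    for path in infiles:
        match = NAME_RE.match(os.path.basename(path))
        if match is None:
            continue  # skip names that do not fit the assumed layout
        glacier_type, glacier, _profile = match.groups()
        groups[glacier_type][glacier].append(path)
    # Sort each glacier's files and convert to plain dicts for easy printing
    return {
        t: {g: sorted(files) for g, files in glaciers.items()}
        for t, glaciers in groups.items()
    }


# Hypothetical example names, for illustration only
infiles = [
    "type_a_cross_profile_aletsch_B.xls",
    "type_a_cross_profile_aletsch_A.xls",
    "type_a_cross_profile_rhone_A.xls",
    "type_d_cross_profile_mer_de_glace_A.xls",
]
grouped = group_files(infiles)
```

With this layout, grouped["type_a"]["aletsch"] holds all cross profiles for that glacier in A-Z order, and glacier names containing underscores still work because the pattern takes everything between "cross_profile_" and the final letter. A nested dict avoids the need for itertools.groupby, which only groups adjacent items and so would require sorting the list by the same key first.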