I am trying to split a list of similar strings into multiple lists in Python.
For example, let's say the list is:
lst = ["asd_A01_000.csv", "asd_A02_000.csv", "asd_A02_001.csv", "asd_A01_001.csv", "asd_A04_000.csv"]
and I want a separate list for each code such as "A01" (so A01, A02, A04, etc.), meaning the result I want would be:
["asd_A01_000.csv","asd_A01_001.csv"]
["asd_A02_000.csv","asd_A02_001.csv"]
["asd_A04_000.csv"]
The numbers do not have to be in order, as long as they are in different lists.
It is pretty easy to do this one code at a time with a for loop that checks whether "A01" is in each string, but I have codes ranging from A01 to A100.
Is there an easy way to do this without writing tons of for loops?
P.S. The strings are actually full file paths which also contain underscores (e.g. C:\Users\Name\Documents\0XX_20220719_XX\asd_A001_000.csv).
CodePudding user response:
One approach is to collect the strings in a collections.defaultdict keyed by the code:
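from collections import defaultdict

lst = ["asd_A01_000.csv", "asd_A02_000.csv", "asd_A02_001.csv", "asd_A01_001.csv", "asd_A04_000.csv"]

# Map each code (e.g. "A01") to the list of strings that contain it
d = defaultdict(list)
for e in lst:
    # The code is the second underscore-separated field
    d[e.split("_")[1]].append(e)

# One list per code, in order of first appearance
res = list(d.values())
print(res)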
Output:
[['asd_A01_000.csv', 'asd_A01_001.csv'], ['asd_A02_000.csv', 'asd_A02_001.csv'], ['asd_A04_000.csv']]
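The P.S. in the question mentions full Windows paths whose folder names also contain underscores, in which case split("_")[1] on the whole string would pick up part of a folder name instead of the code. A minimal sketch of one way around that, assuming the code is always the second underscore-separated field of the file name itself: take the file name with pathlib.PureWindowsPath first, then split. The function name group_by_code and the sample paths are only illustrative.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def group_by_code(paths):
    """Group Windows-style paths by the code in the file name (hypothetical helper)."""
    groups = defaultdict(list)
    for p in paths:
        # PureWindowsPath parses backslash paths on any OS; .name drops the folders
        name = PureWindowsPath(p).name
        # Assumption: the file name looks like "<prefix>_<code>_<index>.csv"
        code = name.split("_")[1]
        groups[code].append(p)
    return dict(groups)


# Illustrative paths modelled on the example in the question
paths = [
    r"C:\Users\Name\Documents\0XX_20220719_XX\asd_A001_000.csv",
    r"C:\Users\Name\Documents\0XX_20220719_XX\asd_A002_000.csv",
    r"C:\Users\Name\Documents\0XX_20220719_XX\asd_A001_001.csv",
]
groups = group_by_code(paths)
print(list(groups.values()))
```

Returning the dict rather than only its values keeps the code available as a key, which may be handy if each group needs to be labelled later.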
CodePudding user response:
You can try itertools.groupby(). It only groups consecutive items, so sort the list by the same key first:
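import itertools

# lst is the list from the question
# groupby() only merges adjacent items, so sort by the code first
lst = sorted(lst, key=lambda asd: asd.split("_")[1])
out = [list(g) for _, g in itertools.groupby(lst, key=lambda asd: asd.split("_")[1])]
print(out)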
Output:
[['asd_A01_000.csv', 'asd_A01_001.csv'], ['asd_A02_000.csv', 'asd_A02_001.csv'], ['asd_A04_000.csv']]
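Since the codes in the question run from A01 to A100, a plain string sort would place "A100" before "A11". The grouping is still correct, but the groups come out in an odd order. A possible variation, assuming every code is a single letter followed by digits: define the key once, have it return the number as an int, and pass the same function to both sorted() and groupby(). The name code_key and the sample list are made up for illustration. Note that with this key "A01" and "A001" would land in the same group.

```python
import itertools


def code_key(name):
    """Return (letter, number) for the code in a name like 'asd_A100_000.csv'."""
    code = name.split("_")[1]      # e.g. "A100"
    return code[0], int(code[1:])  # e.g. ("A", 100)


# Illustrative input with a three-digit code mixed in
lst = ["asd_A100_000.csv", "asd_A02_000.csv", "asd_A100_001.csv", "asd_A11_000.csv"]

# Sort and group with the same key so equal codes end up adjacent
ordered = sorted(lst, key=code_key)
out = [list(g) for _, g in itertools.groupby(ordered, key=code_key)]
print(out)
# [['asd_A02_000.csv'], ['asd_A11_000.csv'], ['asd_A100_000.csv', 'asd_A100_001.csv']]
```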