Separating a list of strings with similarities into different lists

I am trying to separate a list of similar strings into multiple lists in Python.

For example, let's say the list is:

lst = ["asd_A01_000.csv", "asd_A02_000.csv", "asd_A02_001.csv", "asd_A01_001.csv", "asd_A04_000.csv"]

and I want a new list for each distinct code such as "A01" (so A01, A02, A04, etc.), meaning the result I want would be

["asd_A01_000.csv","asd_A01_001.csv"]
["asd_A02_000.csv","asd_A02_001.csv"]
["asd_A04_000.csv"]

The numbers do not have to be in order, as long as they are in different lists.

It is pretty easy to do this one code at a time with a for loop that checks whether "A01" is in each string, but I have codes ranging from A01 to A100.

Is there an easy way to do this without doing tons of for loops?

P.S. The strings are actually full file paths, which also contain underscores (e.g. C:\Users\Name\Documents\0XX_20220719_XX\asd_A001_000.csv)

CodePudding user response:

One approach:

from collections import defaultdict

lst = ["asd_A01_000.csv", "asd_A02_000.csv", "asd_A02_001.csv", "asd_A01_001.csv", "asd_A04_000.csv"]

d = defaultdict(list)

for e in lst:
    # rsplit counts fields from the right, so underscores in directory names are ignored
    d[e.rsplit("_", 2)[-2]].append(e)

res = list(d.values())
print(res)

Output

[['asd_A01_000.csv', 'asd_A01_001.csv'], ['asd_A02_000.csv', 'asd_A02_001.csv'], ['asd_A04_000.csv']]
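
The question's P.S. says the real strings are full paths with underscores in the directory names. Here is a minimal sketch of one way to handle that, assuming Windows-style paths like the one in the question: take the file name with pathlib.PureWindowsPath before splitting. The function name group_by_code and the sample paths are made up for illustration.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def group_by_code(paths):
    """Group paths by the second "_"-separated field of the file name."""
    groups = defaultdict(list)
    for p in paths:
        # .name drops the directories, so underscores in folder names are ignored
        file_name = PureWindowsPath(p).name
        groups[file_name.split("_")[1]].append(p)
    return list(groups.values())


# Hypothetical paths modelled on the example in the question
paths = [
    r"C:\Users\Name\Documents\0XX_20220719_XX\asd_A001_000.csv",
    r"C:\Users\Name\Documents\0XX_20220719_XX\asd_A002_000.csv",
    r"C:\Users\Name\Documents\0XX_20220719_XX\asd_A001_001.csv",
]

print(group_by_code(paths))
```

PureWindowsPath parses backslash-separated paths on any operating system, so this should behave the same on Linux or macOS.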

CodePudding user response:

You can try itertools.groupby()

import itertools

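lst = ["asd_A01_000.csv", "asd_A02_000.csv", "asd_A02_001.csv", "asd_A01_001.csv", "asd_A04_000.csv"]

# groupby only merges adjacent items, so sort by the same key first;
# rsplit counts from the right, so underscores in directory names are ignored
lst = sorted(lst, key=lambda asd: asd.rsplit("_", 2)[-2])
out = [list(g) for _, g in itertools.groupby(lst, lambda asd: asd.rsplit("_", 2)[-2])]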

print(out)

[['asd_A01_000.csv', 'asd_A01_001.csv'], ['asd_A02_000.csv', 'asd_A02_001.csv'], ['asd_A04_000.csv']]
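
To see why the sort matters, here is a small sketch that runs groupby on the question's list with and without sorting. The helper name code_of is only for illustration.

```python
from itertools import groupby


def code_of(name):
    # Second "_"-separated field, e.g. "A01" in "asd_A01_000.csv"
    return name.split("_")[1]


lst = ["asd_A01_000.csv", "asd_A02_000.csv", "asd_A02_001.csv", "asd_A01_001.csv", "asd_A04_000.csv"]

# Unsorted: the two "A01" entries are not adjacent, so they land in separate groups
unsorted_groups = [list(g) for _, g in groupby(lst, key=code_of)]

# Sorted by the same key: equal codes become adjacent and form one group each
sorted_groups = [list(g) for _, g in groupby(sorted(lst, key=code_of), key=code_of)]

print(len(unsorted_groups))  # 4
print(len(sorted_groups))    # 3
```

If you want to skip the sort altogether, the defaultdict approach gives the same grouping in one pass.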