I have a list of file names, some of which are identical except for the timestamp section. Each name has the format name-subname-timestamp, for example:
myList = ['name1-001-20211202811.txt', 'name1-001-202112021010.txt', 'name1-002-202112021010.txt', 'name2-002-202112020811.txt']
What I need is a list that holds, for every name and subname, the most recent file as determined by the timestamp. I started by building a set of every name-subname prefix:
name_subname_list = []
for row in myList:
    # everything before the last '-' is the name-subname prefix
    name_subname_list.append(row.rpartition('-')[0])
name_subname_list = set(name_subname_list)  # {'name1-001', 'name2-002', 'name1-002'}
I'm not sure this is the right approach, and I'm not sure how to continue from here. Any ideas?
CodePudding user response:
This code does what you asked for. For each name-subname, it keeps the timestamp of the newest file and then rebuilds the file name:
from datetime import datetime as dt

dic = {}
for i in myList:
    sp = i.split('-')
    name_subname = sp[0] + '-' + sp[1]
    mytime = sp[2].split('.')[0]  # timestamp without the '.txt' extension
    if name_subname not in dic:
        dic[name_subname] = mytime
    else:
        # compare as datetimes rather than as raw strings
        if dt.strptime(mytime, "%Y%m%d%H%M") > dt.strptime(dic[name_subname], "%Y%m%d%H%M"):
            dic[name_subname] = mytime

result = []
for name_subname in dic:
    result.append(name_subname + '-' + dic[name_subname] + '.txt')
which gives result as:
['name1-001-202112021010.txt',
'name1-002-202112021010.txt',
'name2-002-202112020811.txt']
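As a more compact variant of the same idea, here is a minimal sketch that keeps the whole file name per prefix instead of rebuilding it afterwards. The helper names `parse_timestamp` and `newest_per_prefix` are hypothetical, and the sketch assumes every file name follows the name-subname-timestamp.ext pattern with a timestamp that `"%Y%m%d%H%M"` can parse.

```python
from datetime import datetime


def parse_timestamp(filename):
    """Return the timestamp part of 'name-subname-timestamp.ext' as a datetime."""
    # Text after the last '-' is 'timestamp.ext'; drop the extension.
    stamp = filename.rpartition('-')[2].split('.')[0]
    return datetime.strptime(stamp, "%Y%m%d%H%M")


def newest_per_prefix(filenames):
    """Return the most recent file name for each 'name-subname' prefix."""
    newest = {}
    for filename in filenames:
        prefix = filename.rpartition('-')[0]
        current = newest.get(prefix)
        # Keep this file if the prefix is new or its timestamp is later.
        if current is None or parse_timestamp(filename) > parse_timestamp(current):
            newest[prefix] = filename
    return list(newest.values())


myList = ['name1-001-20211202811.txt', 'name1-001-202112021010.txt',
          'name1-002-202112021010.txt', 'name2-002-202112020811.txt']
print(newest_per_prefix(myList))
```

Using `rpartition('-')` for the prefix reuses the approach from the question, and it also avoids hard-coding the '.txt' extension when the result is assembled.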
CodePudding user response:
Try this:
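from datetime import datetime

myList = ['name1-001-20211202811.txt', 'name1-001-202112021010.txt', 'name1-002-202112021010.txt', 'name2-002-202112020811.txt']
dic = {}
for name in myList:
    parts = name.split('-')
    # group the 'timestamp.txt' parts under their name-subname prefix
    dic.setdefault(parts[0] + '-' + parts[1], []).append(parts[2])
unique_list = []
for key, value in dic.items():
    # parse the timestamp so '20211202811' (08:11) is not ranked above '202112021010' (10:10)
    newest = max(value, key=lambda v: datetime.strptime(v.split('.')[0], '%Y%m%d%H%M'))
    unique_list.append(key + '-' + newest)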
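One caveat worth illustrating: `max` on raw timestamp strings compares them character by character, so it only picks the newest file when all timestamps have the same number of digits. In the sample list, '20211202811' is one digit shorter than the others. The small check below, which uses only values from the question, shows how the two kinds of comparison differ:

```python
from datetime import datetime

stamps = ['20211202811', '202112021010']  # both values come from the sample list

# Plain string comparison: at the ninth character, '8' sorts after '1'.
newest_as_text = max(stamps)

# Comparison after parsing: 08:11 is earlier than 10:10 on the same day.
newest_as_time = max(stamps, key=lambda s: datetime.strptime(s, "%Y%m%d%H%M"))

print(newest_as_text)
print(newest_as_time)
```

If every timestamp is zero-padded to the same width, the two comparisons agree and the plain string `max` is enough.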