I try to iterate excel files under a directory with code below:
import glob
import pandas as pd
files = glob.glob('./*.xlsx')
for file_path in files:
print(file_path)
Out:
./data\S273-2021-12-09.xlsx
./data\S357-2021-12-09.xlsx
./data\S545-2021-12-09.xlsx
./data\S607-2021-12-09.xlsx
Now I hope to replace S273
, S357
, etc. based dataframe df
to map old_name
to new_name
:
old_name new_name
0 S273 a
1 S357 b
2 S545 c
3 S607 d
4 S281 e
To convert dataframe to dictionary if necessary: name_dict = dict(zip(df.old_name, df.new_name))
The expected result will like:
./data\a-2021-12-09.xlsx
./data\b-2021-12-09.xlsx
./data\c-2021-12-09.xlsx
./data\d-2021-12-09.xlsx
How could I achieve that in Python? Sincere thanks at advance.
EDIT:
for file_path in files:
for key, value in name_dict.items():
if key in str(file_path):
new_path = file_path.replace(key, value)
print(new_path)
The code above works, welcome to share other solutions if it's possible.
CodePudding user response:
You can split basename first by os.path.split
, then first part of name of file by -
and mapping by dict.get
, if no match is return same value, so second argument is also first
:
import os
name_dict = dict(zip(df.old_name, df.new_name))
print (name_dict)
{'S273': 'a', 'S357': 'b', 'S545': 'c', 'S607': 'd', 'S281': 'e'}
#for test
L = './data\S273-2021-12-09.xlsx ./data\S357-2021-12-09.xlsx ./data\S545-2021-12-09.xlsx ./data\S607-2021-12-09.xlsx'
files = L.split()
for file_path in files:
head, tail = os.path.split(file_path)
first, last = tail.split('-', 1)
out = os.path.join(head, f'{name_dict.get(first, first)}-{last}')
print(out)
./data\a-2021-12-09.xlsx
./data\b-2021-12-09.xlsx
./data\c-2021-12-09.xlsx
./data\d-2021-12-09.xlsx