For loop and replace part string in paths based on dataframe two columns in Python-CodePudding

I try to iterate excel files under a directory with code below:

import glob
import pandas as pd

files = glob.glob('./*.xlsx')

for file_path in files:
    print(file_path)

Out:

./data\S273-2021-12-09.xlsx
./data\S357-2021-12-09.xlsx
./data\S545-2021-12-09.xlsx
./data\S607-2021-12-09.xlsx

Now I hope to replace S273, S357, etc. based dataframe df to map old_name to new_name:

  old_name new_name
0     S273        a
1     S357        b
2     S545        c
3     S607        d
4     S281        e

To convert dataframe to dictionary if necessary: name_dict = dict(zip(df.old_name, df.new_name))

The expected result will like:

./data\a-2021-12-09.xlsx
./data\b-2021-12-09.xlsx
./data\c-2021-12-09.xlsx
./data\d-2021-12-09.xlsx

How could I achieve that in Python? Sincere thanks at advance.

EDIT:

for file_path in files:
    for key, value in name_dict.items():
        if key in str(file_path):
            new_path = file_path.replace(key, value)
            print(new_path)

The code above works, welcome to share other solutions if it's possible.

CodePudding user response：

You can split basename first by os.path.split, then first part of name of file by - and mapping by dict.get, if no match is return same value, so second argument is also first:

import os

name_dict = dict(zip(df.old_name, df.new_name))
print (name_dict)
{'S273': 'a', 'S357': 'b', 'S545': 'c', 'S607': 'd', 'S281': 'e'}

#for test
L = './data\S273-2021-12-09.xlsx ./data\S357-2021-12-09.xlsx ./data\S545-2021-12-09.xlsx ./data\S607-2021-12-09.xlsx' 
files = L.split()

for file_path in files:
    head, tail = os.path.split(file_path)
    first, last = tail.split('-', 1)
    out = os.path.join(head, f'{name_dict.get(first, first)}-{last}')
    print(out)
    
./data\a-2021-12-09.xlsx
./data\b-2021-12-09.xlsx
./data\c-2021-12-09.xlsx
./data\d-2021-12-09.xlsx