I am trying to understand how i can add to my current script where im able to make changes at sheet level. I want to update names of columns and map them to new names. For example, if a column is called 'locationname' i want to update the column name for all sheets within the file as 'company name'
My code so far where I have managed to delete a specific sheet from a file-
import pandas as pd
from openpyxl import load_workbook
for file in os.listdir("C:/Users/yam/Desktop/hhh/python/sterls"):
if file.startswith("TVC"):
wb = load_workbook(file)
if 'Opt-Ins' in wb.sheetnames:
wb.remove(wb['Opt-Ins'])
wb.save(file)
I want to apply the below to all sheets within the file for my script above-
column_name_update_map = {'Company Name': 'LocationName', 'Address': 'StreetAddress','City':'City','State':'State','Zip':'Zip','Phone Number':'GeneralPhone','Email':'GeneralEmail'}
df = df.rename(columns=column_name_update_map)
How do I implement this within my loop and save to file?
CodePudding user response:
Loop through each header cell, and attempt to assign a new value to each cell, pulling the value from column_name_update_map
. If the current header name (header.value
) is not in column_name_update_map
, a KeyError
will be raised that you can safely ignore. In that case, the value is not updated.
for file in os.listdir("C:/Users/yam/Desktop/hhh/python/sterls"):
if file.startswith("TVC"):
wb = load_workbook(file)
if 'Opt-Ins' in wb.sheetnames:
wb.remove(wb['Opt-Ins'])
for ws in wb:
for header in next(ws.rows):
try:
header.value = column_name_update_map[header.value]
except KeyError:
pass
wb.save(file " (updated headers).xlsx")
CodePudding user response:
you can use something like below once workbook is loaded to get sheets from the workbook and update it accordingly in loop
from openpyxl.utils.dataframe import dataframe_to_rows
sheet_names = wb.get_sheet_names()
for item in sheet_names:
current_sheet = wb.get_sheet_by_name(item)
data = current_sheet.values
columns = next(data)[0:] // gives you a first row as header
df = pd.DataFrame(data, columns=columns)
df.rename(columns=column_name_update_map)
for r in dataframe_to_rows(df, index=True, header=True):
ws.append(r)
and then save it. Hope it will help
Alternatively you can just get the first row if sheet and update it using this line
columns = next(data)[0:]