I wonder how can I read an Excel file using Python (xlsx extension), then modify only rows 4,5,6,7,... and save modified values in such format with the same header (yellow rows, picture, name of the columns - Description, Part number,...) and fixed columns width.
When modifying the rows 4,5,6,7... I must use data frames. Now I'm doing this in this way:
- Get the data from XML
- Save the data to data frame.
- Save df in file .csv.
def get_data_from_xml(path):
#print(path)
tree = ET.parse(path)
root = tree.getroot()
for Type in root.iter('Type'):
PN = Type.get('name')
#print(f"Part number: {PN}")
DS = Type.get('Desc')
#print(f"Description: {DS}")
for Component in root.iter('Component'):
CName = Component.get('name')
if CName == 'Pos010_GearHousing':
for ComponentNo in Component.iter('ComponentNo'):
GH = ComponentNo.get('name')
#print(f"Gear Housing: {GH}")
if CName == 'Pos058_PowerPack':
for ComponentNo in Component.iter('ComponentNo'):
PP = ComponentNo.get('name')
#print(f"Power Pack: {PP}")
if CName == 'Pos082_PedalInterface':
for ComponentNo in Component.iter('ComponentNo'):
PI = ComponentNo.get('name')
#print(f"Pedal Interface: {PI}")
#print(PN, GH, PP, DS, PI)
return PN, GH, PP, DS, PI
pn_list = []
gh_list = []
pp_list = []
ds_list = []
pi_list = []
path = 'C:/Users/STJ2TW/Desktop/Files XML/'
for filename in os.listdir(path):
if '.' not in filename:
fullname = os.path.join(path, filename)
#print(fullname)
PN, GH, PP, DS, PI = get_data_from_xml(fullname)
#print(pn, gh, pp)
pn_list.append(PN)
gh_list.append(GH)
pp_list.append(PP)
ds_list.append(DS)
pi_list.append(PI)
df = pd.DataFrame(
{'Description': ds_list,
'Part number': pn_list,
'Gear Housing': gh_list,
'Power Pack': pp_list,
'Pedal Interface': pi_list,
})
df = df.set_index('Description', drop = True)
df.to_csv("C:/Users/STJ2TW/Desktop/Files CSV/file.csv", sep=";")
df.head(10)
But instead of .csv file I should get the .xmls file with modified values.
To sum up: First, I should read the XLSX file with formatting and header. Then read the values from another file - XML, put them into data frame and finally save data frame into this XLSX formatted file (on the picture).
How should I do this?
Thanks in advance for any help.
CodePudding user response:
You can use openpyxl to modify existing .xlsx files. You can find a lot of information about this package online, and you can see an example here.