Home > Software engineering >  Pandas Merge Cells in First Column with Same Values
Pandas Merge Cells in First Column with Same Values

Time:11-23

I want to merge the consecutive values in the first column of an Excel file and export it to another one. My question is pretty similar to this one but I can't get the correct output file.

Input Excel file (Modules.xlsx)

Input Excel file

data = pd.read_excel(io="Modules.xlsx")
df = pd.DataFrame(data=data).set_index([data.columns[0]])
print(df)
with pd.ExcelWriter(path="excel_file.xlsx", engine="xlsxwriter") as writer:
    df.to_excel(excel_writer=writer, sheet_name="Inventories")
    old_ws = writer.sheets.get("Inventories")
    for col, val in enumerate(df.reset_index().columns):
        old_ws.write(0, col, val)
                                                   Module Name Serial Number       PID                                 Description
MGMT IP Address (Hostname)
sandbox-iosxe-latest-1.cisco.com (csr1000v-1)          Chassis   9ESGOBARV9D  CSR1000V                      Cisco CSR1000V Chassis
sandbox-iosxe-latest-1.cisco.com (csr1000v-1)        module R0   JAB1303001C  CSR1000V              Cisco CSR1000V Route Processor
sandbox-iosxe-latest-1.cisco.com (csr1000v-1)        module F0           NaN  CSR1000V  Cisco CSR1000V Embedded Services Processor
sandbox-iosxe-recomm-1.cisco.com (csr1000v-recomm)     Chassis   926V75BDNRJ  CSR1000V                      Cisco CSR1000V Chassis
sandbox-iosxe-recomm-1.cisco.com (csr1000v-recomm)   module R0   JAB1303001C  CSR1000V              Cisco CSR1000V Route Processor
sandbox-iosxe-recomm-1.cisco.com (csr1000v-recomm)   module F0           NaN  CSR1000V  Cisco CSR1000V Embedded Services Processor

The output excel_file.xlsx is exactly the same as Modules.xlsx. What am I missing to get the excel_file.xlsx to look like the image below?

Pandas v1.3.4 & xlsxwriter v3.0.2

Excel file

CodePudding user response:

First, what is "d" in df = pd.DataFrame(data=data).set_index([d.columns[0]]) ?

From the accepted answer to the question you linked I take that the Index must be multilevel (more than 1 index)

So you would have ...set_index(["MGMT IP Address (Hostname)", "Module Name"])

Without having your data I can't check that though.

Maybe this simple example holds true for your data as well:

import pandas as pd

data = {"A": ["a", "a", "b" ,"c" ,"d" ], "B": [2, 2, 2, 2, 1], "C":[1, 2, 3, 5, 6]}
df1 = pd.DataFrame(data=data).set_index(["A"])
df1
   B  C
A      
a  2  1
a  2  2
b  2  3
c  2  5
d  1  6

df2 = pd.DataFrame(data=data).set_index(["A","B"])
df2

     C
A B   
a 2  1
  2  2
b 2  3
c 2  5
d 1  6
  • Related