Home > OS >  Python pandas concat for multipleobjects
Python pandas concat for multipleobjects

Time:05-31

Im extracting data from excel with multiple sheet, I used for loop for reading the sheet because I need to attached the sheet name every rows. My problem is every sheet that extracted my code create a new dataframe not append in last index, how to merge or concat into one?

Current output look like:

    Control No.:   Unnamed: 1      Unnamed: 2 
0       127            test             NaN
1       458            test             NaN
2       278            test             NaN
3       2453           test             NaN


    Control No.:   Unnamed: 1      Unnamed: 2 
0       573            sample        NaN
1       782            sample        NaN
2       222            sample        NaN
3       257            sample        NaN

I use pd.concat and pd.merge ignoring the index but still not working, also excel have more than 7 sheets.

Im started with this code:

xls = pd.ExcelFile('file_loc')
sheets = [sheet for sheet in xls.sheet_names]
for sheet_name in sheets:
      df = pd.read_excel('file_loc', sheet_name=sheet_name)
      df1 = pd.concat([df], ignore_index=True, sort=False)
      print(df1)

My needed output is like this:

    Control No.:   Unnamed: 1      Unnamed: 2 
0       127            test             NaN
1       458            test             NaN
2       278            test             NaN
3       2453           test             NaN
4       573            sample        NaN
5       782            sample        NaN
6       222            sample        NaN
7       257            sample        NaN

CodePudding user response:

Try using below code.

  • we create the empty list for storing the dataframe in each iteration.
  • In loop, we read the df from excel file and append the dataframe to the list
  • After loop, we concat all dataframes in list.
xls = pd.ExcelFile('file_loc')
sheets = [sheet for sheet in xls.sheet_names]
list_df = []
for sheet_name in sheets:
      df = pd.read_excel('file_loc', sheet_name=sheet_name)
      list_df  = [df]

df = pd.concat(df).reset_index(drop=True)
  • Related