Concatenating two dataframes results in an empty dataframe

I have dataB, which I want to append to my dataframe dataA when a certain condition is fulfilled. This should be really easy, but I'm getting an empty result at the end. (I can make it work with 'append', but I read that it's deprecated.) Below is my code:

- I read dataB from a file.
- I create the dataA dataframe with the same keys.
- I loop through dataB and, when the condition is met, I get the row and 'append' it to dataA.

Using concat, this does not work.

dataB = pd.read_excel("Data.xlsx")
keys = dataB.keys()
dataA = pd.DataFrame(columns=keys)



for i in range(0,len(dataB)):
    if dataB['Coupon'][i] == 0:
        pd.concat([dataA,dataB.iloc[i]])

I get:

Empty DataFrame
Columns: [Ttm, Cusip, ISIN, Full_ISIN, Coupon, Issue Date, Des, YTM]
Index: []

CodePudding user response：

It appears that you are just trying to filter dataB, keeping only the rows where Coupon is zero. You do not need a loop for that. Instead, use any of the selection methods pandas provides, e.g. loc or query.

For example: dataA = dataB.query("Coupon == 0")
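The same filter can also be written with a boolean mask, with or without loc. The following is a minimal sketch; the small dataB built here is a made-up stand-in for the contents of Data.xlsx, so the values are assumptions:

```python
import pandas as pd

# Hypothetical stand-in for the data read from Data.xlsx
dataB = pd.DataFrame({'Coupon': [0, 0, 0, 1, 1, 1],
                      'year': [2003, 2004, 2005, 2003, 2004, 2004]})

# Three equivalent ways to keep only the rows where Coupon is zero
by_query = dataB.query("Coupon == 0")
by_mask = dataB[dataB['Coupon'] == 0]
by_loc = dataB.loc[dataB['Coupon'] == 0]

print(by_mask)
```

All three return a new DataFrame and leave dataB unchanged, so the result still has to be assigned to dataA.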

If you want to keep your original approach, then in addition to the missing assignment to dataA pointed out in Alonso's answer, you need to make sure that you concatenate DataFrames in order to preserve the columns. dataB.iloc[i] selects a single row as a Series, which concat does not lay out as a row of dataA. You can use dataB.iloc[[i]] to select a single row as a DataFrame:

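import pandas as pd

dataB = pd.DataFrame({'Coupon': [0, 0, 0, 1, 1, 1],
                      'year': [2003, 2004, 2005, 2003, 2004, 2004]})

# Empty dataframe with the same columns as dataB
keys = dataB.keys()
dataA = pd.DataFrame(columns=keys)

for i in range(len(dataB)):
    if dataB['Coupon'][i] == 0:
        # iloc[[i]] keeps the row as a DataFrame; concat returns a new
        # object, so the result must be assigned back to dataA
        dataA = pd.concat([dataA, dataB.iloc[[i]]])

print(dataA)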

#result:
  Coupon  year
0      0  2003
1      0  2004
2      0  2005
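
To see the two points separately (Series versus DataFrame selection, and concat returning a new object rather than modifying dataA), here is a small sketch. The two-row dataB and the variable names are made up for illustration:

```python
import pandas as pd

# Hypothetical two-row example frame
dataB = pd.DataFrame({'Coupon': [0, 1], 'year': [2003, 2004]})

row_as_series = dataB.iloc[0]    # single brackets: a Series
row_as_frame = dataB.iloc[[0]]   # double brackets: a one-row DataFrame

print(type(row_as_series).__name__)  # Series
print(type(row_as_frame).__name__)   # DataFrame

dataA = pd.DataFrame(columns=dataB.columns)

# concat does not modify its inputs, so this result is thrown away
pd.concat([dataA, row_as_frame])
len_without_assignment = len(dataA)

# Assigning the result back is what actually grows dataA
dataA = pd.concat([dataA, row_as_frame])
len_with_assignment = len(dataA)

print(len_without_assignment, len_with_assignment)  # 0 1
```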