AttributeError: 'list' object has no attribute 'assign'-CodePudding

I have this dataframe:

   SRC  Coup  Vint    Bal   Mar   Apr   May   Jun   Jul BondSec
0  JPM   1.5  2021   43.9   5.6   4.9   4.9   5.2   4.4    FNCL
1  JPM   1.5  2020   41.6   6.2   6.0   5.6   5.8   4.8    FNCL
2  JPM   2.0  2021  503.9   7.1   6.3   5.8   6.0   4.9    FNCL
3  JPM   2.0  2020  308.3   9.3   7.8   7.5   7.9   6.6    FNCL
4  JPM   2.5  2021  345.0   8.6   7.8   6.9   6.8   5.6    FNCL
5  JPM   4.5  2010    5.7  21.3  20.0  18.0  17.7  14.6    G2SF
6  JPM   5.0  2019    2.8  39.1  37.6  34.6  30.8  24.2    G2SF
7  JPM   5.0  2018    7.3  39.8  37.1  33.4  30.1  24.2    G2SF
8  JPM   5.0  2010    3.9  23.3  20.0  18.6  17.9  14.6    G2SF
9  JPM   5.0  2009    4.2  22.8  21.2  19.5  18.6  15.4    G2SF

I want to duplicate all the rows that have FNCL as the BondSec, and rename the value of BondSec in those new duplicate rows to FGLMC. I'm able to accomplish half of that with the following code:

if "FGLMC" not in jpm['BondSec']:
    is_FNCL = jpm['BondSec'] == "FNCL"
    FNCL_try = jpm[is_FNCL]
    jpm.append([FNCL_try]*1,ignore_index=True)

But if I instead try to implement the change to the BondSec value in the same line as below:

jpm.append(([FNCL_try]*1).assign(**{'BondSecurity': 'FGLMC'}),ignore_index=True)

I get the following error: AttributeError: 'list' object has no attribute 'assign'

Additionally, I would like to insert the duplicated rows based on an index condition, not just at the bottom as additional rows. The condition cannot be simply a row position because this will have to work on future files with different numbers of rows. So I would like to insert the duplicated rows at the position where the BondSec column values change from FNCL to FNCI (FNCI is not showing here, but basically it would be right below the last row with FNCL). I'm assuming this could be done with an np.where function call, but I'm not sure how to implement that.

I'll also eventually want to do this same exact process with rows with FNCI as the BondSec value (duplicating them and transforming the BondSec value to FGCI, and inserting at the index position right below the last row with FNCI as the value).

CodePudding user response：

I'd suggest a helper function to handle all your duplications:

def duplicate_and_rename(df, target, value):
    return pd.concat([df, df[df["BondSec"] == target].assign(BondSec=value)])

Then

for target, value in (("FNCL", "FGLMC"), ("FNCI", "FGCI")):
    df = duplicate_and_rename(df, target, value)

Then after all that, you can categorize the BondSec column and use a custom order:

ordering = ["FNCL", "FGLMC", "FNCI", "FGCI", "G2SF"]

df["BondSec"] = pd.Categorical(df["BondSec"], ordering).sort_values()
df = df.reset_index(drop=True)

Alternatively, you can use a dictionary for your ordering, as explained in this answer.

CodePudding user response：

is_FNCL = jpm['BondSecurity'] == "FNCL"
FNCL_try = jpm[is_FNCL]
FNCL_try = FNCL_try.assign(BondSecurity= 'FGLMC') #assigns 'FGLMC to BondSecurity'
df3=pd.concat([is_FNCL, FNCL_try]) #combines 2 dataframes
print(df3)