I have a dataframe as follows,
import pandas as pd
import numpy as np
df = pd.DataFrame({"text": ['open','the','door','val','close','the','door','val'], "label": ['O','B','D',None,'C','E','N',None]})
I would like to add a row after every row where the label column is None. I tried the following, but I get a KeyError for the last index in the dataframe.
df2= np.where(df.label== None, df.loc[len(df)]==['new_val','new_val'], df)
print(df2)
the error is,
raise KeyError(key) from err
KeyError: 8
my desired output is,
      text    label
0     open        O
1      the        B
2     door        D
3      val     None
4  new_val  new_val
5    close        C
6      the        E
7     door        N
8      val     None
9  new_val  new_val
CodePudding user response:
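Use concat with a helper DataFrame filtered to the rows where label is None (missing values are detected by Series.isna), set the new values in both columns with DataFrame.assign, then sort by index with DataFrame.sort_index so each new row lands right after the row it was copied from, and finally recreate a default index with reset_index(drop=True):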
df = (pd.concat([df, df[df.label.isna()].assign(text='new_val',label='new_val')])
.sort_index()
.reset_index(drop=True))
print (df)
      text    label
0     open        O
1      the        B
2     door        D
3      val     None
4  new_val  new_val
5    close        C
6      the        E
7     door        N
8      val     None
9  new_val  new_val
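
One caveat, offered as a cautious sketch rather than a required change: after the concat, each filler row shares its index label with the row it was copied from, so the final order depends on how sort_index breaks ties. The default sort kind is quicksort, which is not documented as stable, so on a larger frame it may be safer to request a stable algorithm explicitly. The variable names filler and out below are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "text": ['open', 'the', 'door', 'val', 'close', 'the', 'door', 'val'],
    "label": ['O', 'B', 'D', None, 'C', 'E', 'N', None],
})

# One filler row per missing label; each keeps the index of the row it follows
filler = df[df["label"].isna()].assign(text="new_val", label="new_val")

# mergesort is stable, so among equal index labels the original row
# (which comes first in the concat) stays ahead of its filler row
out = (pd.concat([df, filler])
         .sort_index(kind="mergesort")
         .reset_index(drop=True))
print(out)
```

For the small frame in the question this should give the same result as the version above; the explicit kind only matters when ties could otherwise be reordered.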