I have a pandas DataFrame that looks similar to this:
name | status |
---|---|
eric | single |
. | 0 |
xavier | couple |
sarah | couple |
. | 0 |
aaron | divorced |
. | 0 |
I would like to add a new, empty row after every row whose name is a period ('.'), as below:
name | status |
---|---|
eric | single |
. | 0 |
 | |
xavier | couple |
sarah | couple |
. | 0 |
 | |
aaron | divorced |
. | 0 |
 | |
Appreciate any guidance on this!
CodePudding user response:
You can use groupby and apply to concatenate a dummy row to the end of each group:
(df
.groupby(df['name'].shift().eq('.').cumsum(), group_keys=False)
.apply(lambda g: pd.concat([g, pd.DataFrame(index=[0], columns=g.columns)]).fillna(''))
)
output:
name status
0 eric single
1 . 0
0
2 xavier couple
3 sarah couple
4 . 0
0
5 aaron divorced
6 . 0
0
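To see why the grouping key above works, here is a minimal sketch. It assumes the sample frame from the question is rebuilt by hand, with status holding a mix of strings and the integer 0. That construction is an assumption, since the original frame's dtypes are not shown.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (dtypes are assumed).
df = pd.DataFrame({
    "name": ["eric", ".", "xavier", "sarah", ".", "aaron", "."],
    "status": ["single", 0, "couple", "couple", 0, "divorced", 0],
})

# shift() moves each name down one row, so eq('.') is True on the row
# that FOLLOWS a period; cumsum() then numbers the groups 0, 1, 2, ...
key = df["name"].shift().eq(".").cumsum()
print(key.tolist())  # [0, 0, 1, 1, 1, 2, 2]
```

Each group therefore ends with its '.' row, which is why appending one dummy row per group puts the blank line right after every period.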
Or extract the rows whose name is '.', blank them out, and concat them back:
df2 = df[df['name'].eq('.')].copy()
df2.loc[:] = ''
pd.concat([df, df2]).sort_index(kind='stable')
output:
name status
0 eric single
1 . 0
1
2 xavier couple
3 sarah couple
4 . 0
4
5 aaron divorced
6 . 0
6
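As the output shows, the blank rows reuse the index labels of the '.' rows. If a clean 0..n-1 index is wanted, a possible variant is sketched below. The sample frame is rebuilt by hand (an assumption), and the blank rows are built directly from a scalar instead of being copied and overwritten, which is a swapped-in detail rather than the method shown above.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (dtypes are assumed).
df = pd.DataFrame({
    "name": ["eric", ".", "xavier", "sarah", ".", "aaron", "."],
    "status": ["single", 0, "couple", "couple", 0, "divorced", 0],
})

# One empty-string row per '.' row, carrying the same index label.
blanks = pd.DataFrame("", index=df.index[df["name"].eq(".")], columns=df.columns)

# The stable sort keeps each original row ahead of the blank row that
# shares its label; reset_index then renumbers the result from 0.
out = (
    pd.concat([df, blanks])
    .sort_index(kind="stable")
    .reset_index(drop=True)
)
print(out)
```

The stable sort matters here: with a non-stable algorithm, the relative order of the two rows sharing a label would not be guaranteed.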