I have a dataframe:
id  type   value
1   inner  Upload new model. \nUpdate data.
2   outer  Create new task.
I want to split every row whose value column contains \n, so that each piece of the text ends up in its own row. The desired result is:
id  type   value
1   inner  Upload new model.
1   inner  Update data.
2   outer  Create new task.
This dataframe is only an example; the real one is much bigger, so I need a function that I can apply to the whole dataframe. How could I do that?
CodePudding user response:
You can do the following:
df['value'] = df['value'].replace(r'\\n', '\\n ', regex=True)
which puts a space between \n and the next word. Then
(df.set_index(['id', 'type'])
   .apply(lambda x: x.str.split('\\n ').explode())
   .reset_index())
which gives
   id   type  value
0   1  inner  Upload new model.
1   1  inner  Update data.
2   2  outer  Create new task.
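It is not clear from the question whether the value column holds a real newline character or the two literal characters backslash and n, and the right split pattern depends on which one it is. A minimal sketch for checking this, assuming the sample data from the question is stored with the literal form (the names has_literal and has_newline are only illustrative):

```python
import pandas as pd

# Sample data from the question. The raw string keeps "\n" as two literal
# characters (backslash + n) rather than a newline character.
df = pd.DataFrame({
    "id": [1, 2],
    "type": ["inner", "outer"],
    "value": [r"Upload new model. \nUpdate data.", "Create new task."],
})

# regex=False makes str.contains look for the exact substring.
has_literal = df["value"].str.contains("\\n", regex=False)  # backslash + n
has_newline = df["value"].str.contains("\n", regex=False)   # real newline

print(has_literal.tolist())
print(has_newline.tolist())
```

If only has_newline is True for your data, a plain split on '\n' should be enough and no escaping is needed.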
CodePudding user response:
Maybe this will help:
>>> df.assign(value=df['value'].str.split('\n')).explode('value')
   id   type  value
0   1  inner  Upload new model.
0   1  inner  Update data.
1   2  outer  Create new task.
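Since the question asks for a reusable function, the split-and-explode idea can be wrapped up roughly as follows. This is only a sketch: the name split_rows and its column parameter are hypothetical, it assumes pandas 1.4 or later (for the regex argument of str.split), and it assumes the separator may appear either as a real newline or as the literal characters \n.

```python
import pandas as pd


def split_rows(df, column="value"):
    """Return a new dataframe with `column` split into one row per piece.

    Hypothetical helper: splits on a real newline or on the literal
    two-character sequence backslash + n, and drops the whitespace
    directly around the separator.
    """
    # Regex: optional whitespace, then a newline or a literal "\n",
    # then optional whitespace.
    pieces = df[column].str.split(r"\s*(?:\n|\\n)\s*", regex=True)
    return (
        df.assign(**{column: pieces})
        .explode(column)            # one row per list element
        .reset_index(drop=True)     # replace the repeated index with 0..n-1
    )


df = pd.DataFrame({
    "id": [1, 2],
    "type": ["inner", "outer"],
    "value": [r"Upload new model. \nUpdate data.", "Create new task."],
})
result = split_rows(df)
print(result)
```

Resetting the index is a design choice: explode keeps the original row label for every piece, so without it the index would contain duplicates.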