How to split a row on ‘\n’ and add everything after the split to a new row?

I have a dataframe:

id    type      value
1    inner      Upload new model. \nUpdate data. 
2    outer      Create new task.

I want to split each row whose value column contains \n, placing the text after each \n in a new row. The desired result is:

id    type      value
1    inner      Upload new model.
1    inner      Update data. 
2    outer      Create new task.

This dataframe is only an example; the real one is much bigger, so I need a function that I can apply to the whole dataframe. How could I do that?

CodePudding user response：

You can do the following:

df['value'] = df['value'].replace(r'\\n', '\\n ', regex=True)

which puts a space between \n and the next word. Then

(df.set_index(['id', 'type'])
   .apply(lambda x: x.str.split('\\n ').explode())
   .reset_index())

which gives

   id   type               value
0   1  inner  Upload new model. 
1   1  inner       Update data. 
2   2  outer    Create new task.
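
If the column holds a literal backslash followed by n (two characters) rather than a real newline, the split can probably be done in one step with a plain-string separator. This is only a sketch: it assumes pandas 1.4 or newer (for the regex argument of str.split), and the names literal_sep and out are made up for the illustration.

```python
import pandas as pd

# Sample frame from the question. The raw string keeps "\n" as two literal
# characters (backslash + n); whether the real data looks like this is an assumption.
df = pd.DataFrame({
    "id": [1, 2],
    "type": ["inner", "outer"],
    "value": [r"Upload new model. \nUpdate data. ", "Create new task."],
})

literal_sep = "\\n"  # two characters: a backslash and the letter n

out = (
    # regex=False treats the separator as plain text, so no regex escaping is needed
    df.assign(value=df["value"].str.split(literal_sep, regex=False))
      .explode("value")          # one row per piece; id and type are repeated
      .reset_index(drop=True)    # replace the repeated index labels with 0..n-1
)
out["value"] = out["value"].str.strip()  # drop the stray spaces around each piece
print(out)
```

If the values contain a real newline character instead, set literal_sep to "\n".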

CodePudding user response：

Maybe this will help:

>>> df.assign(value=df['value'].str.split('\n')).explode('value')
   id   type               value
0   1  inner  Upload new model. 
0   1  inner        Update data.
1   2  outer    Create new task.
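
Since the question asks for a function, the same split-and-explode idea could be wrapped roughly like this. It is a sketch, not a definitive implementation: split_rows and its parameters are hypothetical names, it assumes pandas 1.4 or newer, and it assumes the separator is a real newline unless another one is passed in.

```python
import pandas as pd

def split_rows(df, column, sep="\n"):
    """Return a new frame in which `column` is split on `sep`, one piece per row."""
    # regex=False treats `sep` as plain text rather than a regular expression
    pieces = df[column].str.split(sep, regex=False)
    # explode repeats the other columns for every piece; then renumber the rows
    out = df.assign(**{column: pieces}).explode(column).reset_index(drop=True)
    out[column] = out[column].str.strip()
    # Drop pieces that are empty after stripping (e.g. from a trailing separator)
    return out[out[column] != ""].reset_index(drop=True)

# Usage with the frame from the question, assuming a real newline in the text
df = pd.DataFrame({
    "id": [1, 2],
    "type": ["inner", "outer"],
    "value": ["Upload new model. \nUpdate data. ", "Create new task."],
})
result = split_rows(df, "value")
print(result)
```

The function leaves the original frame untouched and resets the index, so the repeated 0 label seen in the output above does not carry over.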