I have a dataframe with one column "text":
text
I love cakes  we should make them
Joe is very late  will there be photography?
you should wright code correctly  it is very important
I want to explode those rows in cases where there are 2 or more spaces between texts. So desired output is:
text
I love cakes
we should make them
Joe is very late
will there be photography?
you should wright code correctly
it is very important
I know that I can do: df["text"].apply(lambda x: x.split("  "))
but I don't want to call split once for each possible number of spaces (df["text"].apply(lambda x: x.split("  ")), df["text"].apply(lambda x: x.split("   ")), df["text"].apply(lambda x: x.split("    ")), ...).
I want a single condition: 2 or more spaces. How could I do that?
CodePudding user response:
You can split by regex and then explode the column:
df = df['text'].str.split(r'\s{2,}').explode().reset_index().drop(columns="index")
Output
text
0 I love cakes
1 we should make them
2 Joe is very late
3 will there be photography?
4 you should wright code correctly
5 it is very important
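For reference, here is a minimal self-contained sketch of the same approach. The sample data is reconstructed from the question, and the exact number of spaces between the two halves of each row is an assumption. It uses reset_index(drop=True).to_frame() instead of dropping the "index" column afterwards, which should give the same result.

```python
import pandas as pd

# Sample data reconstructed from the question.
# The number of spaces between the two parts of each row is an assumption.
df = pd.DataFrame({
    "text": [
        "I love cakes  we should make them",
        "Joe is very late   will there be photography?",
        "you should wright code correctly    it is very important",
    ]
})

# \s{2,} matches a run of two or more whitespace characters,
# so single spaces inside a phrase are left alone.
result = (
    df["text"]
    .str.split(r"\s{2,}")      # each row becomes a list of pieces
    .explode()                 # one piece per row
    .reset_index(drop=True)    # renumber rows 0..n-1
    .to_frame()                # back to a one-column DataFrame named "text"
)

print(result)
```

If only literal spaces should count (and not tabs or newlines), the pattern r" {2,}" could be used instead.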