Split List of Sentences in a Pandas Dataframe Column into Separate Columns

Time:10-01

I have a pandas DataFrame column (shown here as a Series) in which each value is a list of sentences stored as a string. A sample is below:

import pandas as pd

d = {1: "['f, they have everything i am looking for.', 'd has a lot of different options, and they carry every size needed.', 'q, i always find what I am looking for']",
     2: "['easy to navigate', 'fast and easy to use. very helpful when needed, would recommend! will definitely use in future', 'easy to use, very convenient']"
    }

s = pd.Series(d)

What I would like to do is split each list, which always contains three sentences, into individual columns, like below:

d2 = [['f, they have everything i am looking for.', 'd has a lot of different options, and they carry every size needed.', 'q, i always find what I am looking for'], ['easy to navigate', 'fast and easy to use. very helpful when needed, would recommend! will definitely use in future', 'easy to use, very convenient']]
   
df = pd.DataFrame(d2, columns=['rep1', 'rep2', 'rep3'])
df

My attempts at using Series.str.split() have been unsuccessful.

CodePudding user response:

You could strip the brackets and use str.split, but it's not very robust, because it assumes every sentence is wrapped in single quotes:

>>> s.str[2:-2].str.split(r"',\s*'", expand=True).add_prefix('rep')
                                        rep0  ...                                    rep2
1  f, they have everything i am looking for.  ...  q, i always find what I am looking for
2                           easy to navigate  ...            easy to use, very convenient

[2 rows x 3 columns]
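
To illustrate why the split approach is fragile, here is a minimal sketch. The row labelled 3 is hypothetical (it is not in the question's data): one sentence contains an apostrophe, so its string form uses double quotes around that item and the single-quote pattern no longer lines up.

```python
import ast
import pandas as pd

# Hypothetical row: repr() wraps the first sentence in double quotes
# because it contains an apostrophe.
tricky = pd.Series({3: repr(["it's easy to use", "fast", "very convenient"])})

# Slicing off the brackets and splitting on ', ' assumes single quotes everywhere,
# so the first two sentences end up glued together in one cell.
naive = tricky.str[2:-2].str.split(r"',\s*'", expand=True)

# Parsing the literal recovers the real list regardless of the quoting style.
parsed = tricky.apply(ast.literal_eval)

print(naive.shape[1])        # number of columns the split produced
print(len(parsed.iloc[0]))   # number of sentences actually in the list
```

Here the split yields two columns for a list that really holds three sentences, while ast.literal_eval returns all three.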

A more robust way is to parse each string with ast.literal_eval and then pivot:

>>> import ast
>>> df = s.apply(ast.literal_eval).explode().rename('val').reset_index()
>>> df = df.join(df.groupby('index').cumcount().rename('rep').add(1)).pivot(index='index', columns='rep', values='val').add_prefix('rep')
>>> df
rep                                         rep1  ...                                    rep3
index                                             ...                                        
1      f, they have everything i am looking for.  ...  q, i always find what I am looking for
2                               easy to navigate  ...            easy to use, very convenient

[2 rows x 3 columns]
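
If the pivot feels heavy, one alternative (not part of the answer above, just a sketch) is to build the frame directly from the parsed lists. The data below is a shortened stand-in for the question's Series, and the rep1..repN column names are generated to match the requested output.

```python
import ast
import pandas as pd

# Shortened stand-in for the question's data: each value is a list stored as a string.
d = {1: "['a, b', 'c', 'd']", 2: "['e', 'f, g', 'h']"}
s = pd.Series(d)

# Parse each string into a real Python list.
lists = s.apply(ast.literal_eval)

# One row per original entry, one column per sentence; keep the original index.
df = pd.DataFrame(lists.tolist(), index=s.index)

# Name the columns rep1, rep2, ... based on however many columns were produced.
df.columns = [f"rep{i}" for i in range(1, df.shape[1] + 1)]

print(df)
```

If some lists are shorter than others, the shorter rows are padded with missing values, so this also works when the three-sentence assumption does not hold.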