How do you join multiple rows into one row in pandas?

Time:10-01

I have a list that I'm trying to add to a dataframe. It looks something like this:

list_one = ['apple','banana','cherry',' ', 'grape', 'orange', 'pineapple','']

If I add the list to a dataframe, using df = pd.DataFrame({'list_one':list_one}) it'll look like this:

       list_one
   -------------
   0   apple
   1   banana 
   2   cherry
   3  
   4   grape
   5   orange
   6   pineapple
   7  

I want to combine some of the rows into one row, so that the dataframe looks something like this:

       list_one
   -----------------------------
   0   apple, banana, cherry 
   1   grape, orange, pineapple

Is there a simple way to do this?

Thank you for taking the time to read my question and help in any way you can.

CodePudding user response:

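Create a boolean mask of the rows that contain word characters with Series.str.contains. Invert it with ~ and take Series.cumsum to create group IDs, so the counter goes up at every blank row. Then filter to the matched rows only and pass them to GroupBy.agg with the join function: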

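# True where the row contains at least one word character
m = df['list_one'].str.contains(r'\w+')
# Blank rows increment the cumulative sum, which separates the groups
df = df[m].groupby((~m).cumsum(), as_index=False).agg(', '.join)
print(df)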
                   list_one
0     apple, banana, cherry
1  grape, orange, pineapple
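
To see why this works, it can help to look at the intermediate values. The following is a minimal, self-contained sketch of the same idea, not the answerer's exact code: the names group_id and result are made up for illustration, and the grouping is done on the column as a Series (then turned back into a frame) so the group key does not share a name with the column.

```python
import pandas as pd

list_one = ['apple', 'banana', 'cherry', ' ', 'grape', 'orange', 'pineapple', '']
df = pd.DataFrame({'list_one': list_one})

# True for rows holding at least one word character; False for the separators.
m = df['list_one'].str.contains(r'\w+')

# Each separator row bumps the counter, so rows between separators share an ID.
# 'group_id' is a hypothetical name chosen for this sketch.
group_id = (~m).cumsum().rename('group_id')

# Keep only the word rows, join each group, and rebuild a one-column frame.
result = (
    df.loc[m, 'list_one']
    .groupby(group_id[m])
    .agg(', '.join)
    .reset_index(drop=True)
    .to_frame()
)
print(result)
```

Here m is True for the six fruit rows and False for the ' ' and '' rows, and group_id is 0 for the first three fruits and 1 for the next three, so the join produces the two rows shown above.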

CodePudding user response:

Try with groupby and agg:

>>> df.groupby(df.loc[df['list_one'].str.contains(r'\w+')].index.to_series().diff().ne(1).cumsum(), as_index=False).agg(', '.join)
                   list_one
0     apple, banana, cherry
1  grape, orange, pineapple
>>> 
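
The one-liner above is dense, so here is a rough step-by-step sketch of the same diff-based idea. The names word_idx, run_id and result are illustrative only, and the sketch assumes the frame has the default consecutive integer index (0, 1, 2, ...); with any other index, the "gap of 1" test would not mean "adjacent rows".

```python
import pandas as pd

list_one = ['apple', 'banana', 'cherry', ' ', 'grape', 'orange', 'pineapple', '']
df = pd.DataFrame({'list_one': list_one})

# Index labels of the rows that contain at least one word character.
word_idx = df.loc[df['list_one'].str.contains(r'\w+')].index.to_series()

# A step other than 1 between consecutive labels means a blank row was skipped,
# so a new run starts there (the first row has a NaN diff, which also counts).
run_id = word_idx.diff().ne(1).cumsum()

# Group the word rows by run and join each run into one string.
result = (
    df.loc[word_idx.index, 'list_one']
    .groupby(run_id)
    .agg(', '.join)
    .reset_index(drop=True)
    .to_frame()
)
print(result)
```

Compared with the mask-and-cumsum approach, this version never looks at the blank rows directly; it infers the breaks from the holes they leave in the index.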