I have a list that I'm trying to add to a dataframe. It looks something like this:
list_one = ['apple','banana','cherry',' ', 'grape', 'orange', 'pineapple','']
If I add the list to a dataframe, using df = pd.DataFrame({'list_one':list_one})
it'll look like this:
list_one
-------------
0 apple
1 banana
2 cherry
3
4 grape
5 orange
6 pineapple
7
I want to combine the rows between the blank entries into one row each, so that the dataframe looks something like this:
list_one
-----------------------------
0 apple, banana, cherry
1 grape, orange, pineapple
Is there a simple way to do this?
Thank you for taking the time to read my question and help in any way you can.
CodePudding user response:
Create a mask for the rows that contain words with Series.str.contains, invert it with ~ and create groups with Series.cumsum, then filter only the matched rows and pass them to GroupBy.agg with the join function:
m = df['list_one'].str.contains(r'\w+')
df = df[m].groupby((~m).cumsum(), as_index=False).agg(', '.join)
print (df)
list_one
0 apple, banana, cherry
1 grape, orange, pineapple
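To make the steps above easier to follow, here is a minimal sketch that exposes the intermediate values. The names group_id, steps and combined are only illustrative and do not come from the answer, and the sketch uses reset_index(drop=True) in place of as_index=False, on the assumption that this sidesteps any differences in how pandas versions treat an external grouping key.

```python
import pandas as pd

list_one = ['apple', 'banana', 'cherry', ' ', 'grape', 'orange', 'pineapple', '']
df = pd.DataFrame({'list_one': list_one})

# True where the cell contains at least one word character
m = df['list_one'].str.contains(r'\w+')

# Every blank row increments the counter, so the rows between
# two blanks end up sharing the same group id (illustrative name)
group_id = (~m).cumsum().rename('group_id')

# Side-by-side view of the mask and the group ids for inspection
steps = df.assign(is_word=m, group_id=group_id)
print(steps)

# Keep only the word rows, join each group, and renumber the result
combined = (
    df.loc[m, 'list_one']
    .groupby(group_id[m])
    .agg(', '.join)
    .reset_index(drop=True)
    .to_frame()
)
print(combined)
```

Printing steps shows why the blank rows act as separators: they are the only rows where the group id changes.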
CodePudding user response:
Try with groupby and agg:
>>> df.groupby(df.loc[df['list_one'].str.contains(r'\w+')].index.to_series().diff().ne(1).cumsum(), as_index=False).agg(', '.join)
list_one
0 apple, banana, cherry
1 grape, orange, pineapple
>>>
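The one-liner above packs several steps together, so here is a rough unpacked sketch of the same idea: consecutive index labels belong to one run, and a gap in the index starts a new run. The names word_rows, run_id and result are hypothetical, and reset_index(drop=True) is used here instead of as_index=False as a matter of preference.

```python
import pandas as pd

list_one = ['apple', 'banana', 'cherry', ' ', 'grape', 'orange', 'pineapple', '']
df = pd.DataFrame({'list_one': list_one})

# Index labels of the rows that contain a word: 0, 1, 2, 4, 5, 6
word_rows = df.loc[df['list_one'].str.contains(r'\w+')].index.to_series()

# A difference other than 1 (including the leading NaN) marks the
# start of a new run; the cumulative sum turns those marks into ids
run_id = word_rows.diff().ne(1).cumsum()

# Group only the word rows by their run id and join each run
result = (
    df.loc[word_rows.index, 'list_one']
    .groupby(run_id)
    .agg(', '.join)
    .reset_index(drop=True)
    .to_frame()
)
print(result)
```

This variant relies on the index being a gap-free integer range before filtering; if the dataframe has been filtered or re-indexed earlier, calling reset_index(drop=True) first is probably needed.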