Home > Software design >  Pandas: collapse rows with same column ID and concatenate multiple columns into 1
Pandas: collapse rows with same column ID and concatenate multiple columns into 1

Time:12-29

Without iterating over the dataframe, is it possible using pandas methods to "collapse" from this

index    ID     fruits    qty
0        101    oranges   1
1        134    apples    2
2        576    pears     3
3        837    apples    1
4        576    oranges   2
5        134    pears     3

to this (and generate a new description column in the process that sort of summarizes the info into a single string):

index    ID     description
0        101    101: orange(1)
1        134    134: apples(2) pears(3)
2        576    576: pears(3) oranges(2)
3        837    837: apples(1)

Note that this is a spin on this question.

CodePudding user response:

First join columns fruits with qty, then aggregate join and last add ID converted to strings:

df['description'] = df['fruits']   '('   df['qty'].astype(str)   ')'

df = df.groupby('ID')['description'].agg(' '.join).reset_index()
df['description'] = df['ID'].astype(str)    ': '   df['description'] 
print (df)
    ID               description
0  101           101: oranges(1)
1  134   134: apples(2) pears(3)
2  576  576: pears(3) oranges(2)
3  837            837: apples(1)

Another idea with GroupBy.apply:

def f(x):
    d = x['fruits']   '('   x['qty'].astype(str)   ')'
    return x.name.astype(str)    ': '   ' '.join(d)

df = df.groupby('ID').apply(f).reset_index(name='description')

print (df)
    ID               description
0  101           101: oranges(1)
1  134   134: apples(2) pears(3)
2  576  576: pears(3) oranges(2)
3  837            837: apples(1)

CodePudding user response:

Using Pandas there is the apply function. Using this function you can apply lambda functions to rows, columns, or even whole dataframes.

Here is the documentation link to the function. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.apply.html

  • Related