Combining specific rows conditionally and adding the result to an existing row in pandas

Suppose I have the following data frame:

import pandas as pd

data = {'age' :[10,11,12,11,11,10,11,13,13,13,14,14,15,15,15],
        'num1':[10,11,12,13,14,15,16,17,18,19,20,21,22,23,24],
        'num2':[20,21,22,23,24,25,26,27,28,29,30,31,32,33,34]}
df = pd.DataFrame(data)

I want to sum the rows for ages 14 and 15 and keep the result as a single row with age 14. My expected output would look like this:

    age  num1  num2
1   10    10    20
2   11    11    21
3   12    12    22
4   11    13    23
5   11    14    24
6   10    15    25
7   11    16    26
8   13    17    27
9   13    18    28
10  13    19    29
11  14   110   160

In the code below I tried to group by age, but it does not work for me:

df1 =df.groupby(age[age >=14])['num1', 'num2'].apply(', '.join).reset_index(drop=True).to_frame()

CodePudding user response：

limit_age = 14
new = df.query("age < @limit_age").copy()
new.loc[len(new)] = [limit_age,
                     *df.query("age >= @limit_age").drop(columns="age").sum()]

First, get the "before 14" dataframe (rows with age < 14).
Then append a new row to it where
- age is 14
- the other values are the column-wise sums of the "14 and above" dataframe (rows with age >= 14)

to get

>>> new

    age  num1  num2
0    10    10    20
1    11    11    21
2    12    12    22
3    11    13    23
4    11    14    24
5    10    15    25
6    11    16    26
7    13    17    27
8    13    18    28
9    13    19    29
10   14   110   160

(new.index += 1 can be used for a 1-based index at the end.)
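
For reference, the steps above can be written as a self-contained script with each intermediate result named. This is only a sketch: the names below and totals are illustrative and not from the answer, and using len(below) as the new row label assumes the filtered frame still has a default 0..n-1 index, which happens to hold for this data because every row with age >= 14 comes last.

```python
import pandas as pd

data = {'age' :[10,11,12,11,11,10,11,13,13,13,14,14,15,15,15],
        'num1':[10,11,12,13,14,15,16,17,18,19,20,21,22,23,24],
        'num2':[20,21,22,23,24,25,26,27,28,29,30,31,32,33,34]}
df = pd.DataFrame(data)

limit_age = 14

# Step 1: rows strictly below the limit are kept unchanged.
below = df.query("age < @limit_age").copy()

# Step 2: sum every non-age column over the remaining rows.
totals = df.query("age >= @limit_age").drop(columns="age").sum()

# Step 3: append one row holding the limit age and the sums.
# Assumption: len(below) is not already an index label in `below`.
below.loc[len(below)] = [limit_age, *totals]

# Optional: shift to a 1-based index, as in the question's expected output.
below.index += 1

print(below)
```

If the rows to merge were scattered through the frame, len(below) could collide with an existing label and overwrite that row instead of adding one; calling reset_index(drop=True) on the filtered frame before appending would avoid that.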

CodePudding user response：

I would use a mask and concat:

m = df['age'].isin([14, 15])

out = pd.concat([df[~m],
                 df[m].agg({'age': 'min', 'num1': 'sum', 'num2': 'sum'})
                      .to_frame().T
                ], ignore_index=True)

Output:

    age  num1  num2
0    10    10    20
1    11    11    21
2    12    12    22
3    11    13    23
4    11    14    24
5    10    15    25
6    11    16    26
7    13    17    27
8    13    18    28
9    13    19    29
10   14   110   160
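
The mask-and-concat approach can also be wrapped in a small function so the set of ages and the summed columns are parameters. This is a sketch under assumptions: collapse_ages and its arguments are hypothetical names introduced here, the merged row keeps the smallest of the selected ages (as in the answer), and the case where no row matches is not handled.

```python
import pandas as pd

def collapse_ages(df, ages, value_cols):
    """Hypothetical helper: merge all rows whose age is in `ages` into one row."""
    m = df['age'].isin(ages)
    # Keep the smallest matching age and sum each value column.
    spec = {'age': 'min', **{col: 'sum' for col in value_cols}}
    merged = df[m].agg(spec).to_frame().T
    # Untouched rows first, then the single merged row, with a fresh index.
    return pd.concat([df[~m], merged], ignore_index=True)

data = {'age' :[10,11,12,11,11,10,11,13,13,13,14,14,15,15,15],
        'num1':[10,11,12,13,14,15,16,17,18,19,20,21,22,23,24],
        'num2':[20,21,22,23,24,25,26,27,28,29,30,31,32,33,34]}
df = pd.DataFrame(data)

out = collapse_ages(df, [14, 15], ['num1', 'num2'])
print(out)
```

Because ignore_index=True rebuilds the index, the merged row always lands at the end regardless of where the matching rows sat in the original frame.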