How to simplify pandas columns sum?

Time:06-19

I'm trying to sum the products of paired columns, like the following:

The data frame:

ID   name   grade_math   grade_chemistry   grade_physic   CS_math   CS_chemistry   CS_physic
1     A          4              2.75             3           3            2             3
2     B          3               4               4           3            2             3
3     C          2               2               2           3            2             3

the formula is:

df['total'] = (df['grade_math']*df['CS_math']) + (df['grade_chemistry']*df['CS_chemistry']) + (df['grade_physic']*df['CS_physic'])

but I've tried to simplify like this:

df['total'] = sum(df[f'grade{i}'] * df[f'CS{i}'] for i in range(1, 3))

but I realized this logic is totally wrong. Any suggestions?

CodePudding user response:

You were close in your logic. What you're after is this:

sum(df[f'grade_{subject}'] * df[f'CS_{subject}'] for subject in ["math", "chemistry", "physic"])

The issue was that with for i in range(1, 3) you were iterating over numbers. Placing them into the f-strings produces names like grade1, CS1, grade2, and CS2. None of these exist among the columns of your dataframe, so the lookup raises a KeyError.
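To make that concrete, here is a small sketch of which column names the original attempt asks for. The two-column frame is a cut-down stand-in for the question's data, and the variable names attempted and missing are only illustrative.

```python
import pandas as pd

# Cut-down stand-in for the question's dataframe (assumption: only two columns needed here).
df = pd.DataFrame({
    "grade_math": [4, 3, 2],
    "CS_math": [3, 3, 3],
})

# range(1, 3) yields 1 and 2, so these are the names the f-strings build.
attempted = [name for i in range(1, 3) for name in (f"grade{i}", f"CS{i}")]
print(attempted)  # ['grade1', 'CS1', 'grade2', 'CS2']

# None of them is a column label, which is why df[...] fails with a KeyError.
missing = [name for name in attempted if name not in df.columns]
print(missing)
```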

The solution above therefore iterates over the common suffixes ("math", "chemistry", and "physic"), so that each f-string produces a name that actually exists among the dataframe's columns.
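For a version you can run end to end, the sketch below rebuilds the dataframe from the question by hand and assigns the result to the total column. The subjects list is just a name chosen here for the shared suffixes.

```python
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "name": ["A", "B", "C"],
    "grade_math": [4, 3, 2],
    "grade_chemistry": [2.75, 4, 2],
    "grade_physic": [3, 4, 2],
    "CS_math": [3, 3, 3],
    "CS_chemistry": [2, 2, 2],
    "CS_physic": [3, 3, 3],
})

# Shared suffixes of the grade_* / CS_* column pairs.
subjects = ["math", "chemistry", "physic"]

# Built-in sum() adds the per-subject product Series element-wise.
df["total"] = sum(df[f"grade_{s}"] * df[f"CS_{s}"] for s in subjects)
print(df[["name", "total"]])
```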

CodePudding user response:

Use:

sum(df[f'grade_{i}'] * df[f'CS_{i}'] for i in ['math', 'chemistry', 'physic'])

Output:

0    26.5
1    29.0
2    16.0
dtype: float64
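If the list of subjects may grow, one possible variation is to derive the suffixes from the column names instead of hard-coding them. This is only a sketch: weighted_total is a hypothetical helper, not a pandas function, and it assumes every grade_ column has a matching CS_ column.

```python
import pandas as pd

def weighted_total(df, grade_prefix="grade_", weight_prefix="CS_"):
    """Hypothetical helper: sum grade * weight over every grade_* column."""
    # Collect the suffix of each column that starts with the grade prefix.
    subjects = [c[len(grade_prefix):] for c in df.columns if c.startswith(grade_prefix)]
    # Assumes a matching weight column exists for each suffix (KeyError otherwise).
    return sum(df[grade_prefix + s] * df[weight_prefix + s] for s in subjects)

# Data copied from the table in the question.
df = pd.DataFrame({
    "grade_math": [4, 3, 2],
    "grade_chemistry": [2.75, 4, 2],
    "grade_physic": [3, 4, 2],
    "CS_math": [3, 3, 3],
    "CS_chemistry": [2, 2, 2],
    "CS_physic": [3, 3, 3],
})

df["total"] = weighted_total(df)
print(df["total"])
```

With this approach, adding a new grade_/CS_ pair to the dataframe is picked up without touching the formula.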