I'm trying to sum the products of paired columns, as in the following:
The data frame:
ID  name  grade_math  grade_chemistry  grade_physic  CS_math  CS_chemistry  CS_physic
1   A     4           2.75             3             3        2             3
2   B     3           4                4             3        2             3
3   C     2           2                2             3        2             3
The formula is:
df['total'] = (df['grade_math']*df['CS_math']) + (df['grade_chemistry']*df['CS_chemistry']) + (df['grade_physic']*df['CS_physic'])
I tried to simplify it like this:
df['total'] = sum(df[f'grade{i}'] * df[f'CS{i}'] for i in range(1, 3))
but I realized this logic is wrong. Any suggestions?
CodePudding user response:
You were close in your logic. What you're after is this:
sum(df[f'grade_{subject}'] * df[f'CS_{subject}'] for subject in ["math", "chemistry", "physic"])
The issue was that with for i in range(1, 3) you were iterating over the numbers 1 and 2. Placing them into the f-strings therefore produces names like grade1, CS1, grade2 and CS2, which don't exist among the columns of your dataframe.
That is why the solution above iterates over the common suffixes ("math", "chemistry" and "physic"), so that each f-string resolves to a column that actually exists in the dataframe.
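For reference, here is a minimal runnable sketch of the above. It assumes pandas is installed and rebuilds the sample frame from the question by hand, and the variable name subjects is only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "name": ["A", "B", "C"],
    "grade_math": [4, 3, 2],
    "grade_chemistry": [2.75, 4, 2],
    "grade_physic": [3, 4, 2],
    "CS_math": [3, 3, 3],
    "CS_chemistry": [2, 2, 2],
    "CS_physic": [3, 3, 3],
})

# Iterate over the column suffixes, not over numbers.
subjects = ["math", "chemistry", "physic"]

# Built-in sum() starts at 0 and adds each product Series element-wise.
df["total"] = sum(df[f"grade_{s}"] * df[f"CS_{s}"] for s in subjects)

print(df[["name", "total"]])
```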
CodePudding user response:
Use:
sum(df[f'grade_{i}'] * df[f'CS_{i}'] for i in ['math', 'chemistry', 'physic'])
Output:
0 26.5
1 29.0
2 16.0
dtype: float64
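If more subjects might be added later, one option is to derive the suffixes from the column names instead of hard-coding them. This is only a sketch and not part of the answers above. It assumes every grade_<subject> column has a matching CS_<subject> column, and it uses a shortened version of the question's data:

```python
import pandas as pd

# Sample data copied from the question (ID and name omitted for brevity).
df = pd.DataFrame({
    "grade_math": [4, 3, 2],
    "grade_chemistry": [2.75, 4, 2],
    "grade_physic": [3, 4, 2],
    "CS_math": [3, 3, 3],
    "CS_chemistry": [2, 2, 2],
    "CS_physic": [3, 3, 3],
})

prefix = "grade_"

# Collect whatever follows "grade_" in each matching column name.
subjects = [col[len(prefix):] for col in df.columns if col.startswith(prefix)]

# Assumption: a CS_<subject> column exists for every subject found above;
# a missing one would raise a KeyError here.
df["total"] = sum(df[f"grade_{s}"] * df[f"CS_{s}"] for s in subjects)

print(subjects)
print(df["total"])
```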