I have a table like this:
id | q1_score | q2_score | q3_score | q4_score | quarter |
---|---|---|---|---|---|
1 | -0.77 | -0.55 | 0.21 | 3.42 | q1 |
2 | -0.77 | -0.55 | 0.21 | 3.42 | q1 |
3 | -0.77 | -0.55 | 0.21 | 3.42 | q2 |
4 | -0.77 | -0.55 | 0.21 | 3.42 | q3 |
5 | -0.77 | -0.55 | 0.21 | 3.42 | q4 |
I want to create a new column quarter_score
based on the value of quarter
.
- If
quarter==q1
thenquarter_score
=value fromq1_score
of that row. - If
quarter==q2
thenquarter_score
=value fromq2_score
of that row. - so forth ...
My code looks like this
mapper = {'q1': df['q1_score'], 'q2':df['q2_score'], 'q3': df['q3_score'], 'q4': df['q4_score']}
df['quarter_score'] = df['quarter'].map(mapper)
but it does not take single value, it takes whole rows for that column for each row.
id | q1_score | q2_score | q3_score | q4_score | quarter | quarter_score |
---|---|---|---|---|---|---|
1 | -0.77 | -0.55 | 0.21 | 3.42 | q1 | 0 -0.77 1 -0.77 2 -0.77 |
2 | -0.77 | -0.55 | 0.21 | 3.42 | q1 | 0 -0.77 1 -0.77 2 -0.77 |
3 | -0.77 | -0.55 | 0.21 | 3.42 | q2 | 0 -0.55 1 -0.55 2 -0.55 |
4 | -0.77 | -0.55 | 0.21 | 3.42 | q3 | 0 0.21 1 0.21 2 0.21 |
5 | -0.77 | -0.55 | 0.21 | 3.42 | q4 | 0 3.42 1 3.42 2 3.42 |
My final output should be like this below.
id | q1_score | q2_score | q3_score | q4_score | quarter | quarter_score |
---|---|---|---|---|---|---|
1 | -0.77 | -0.55 | 0.21 | 3.42 | q1 | -0.77 |
2 | -0.77 | -0.55 | 0.21 | 3.42 | q1 | -0.77 |
3 | -0.77 | -0.55 | 0.21 | 3.42 | q2 | -0.55 |
4 | -0.77 | -0.55 | 0.21 | 3.42 | q3 | 0.21 |
5 | -0.77 | -0.55 | 0.21 | 3.42 | q4 | 3.42 |
CodePudding user response:
Try:
df["quarter_score"] = df.apply(lambda x: x[x["quarter"] "_score"], axis=1)
print(df)
Prints:
id q1_score q2_score q3_score q4_score quarter quarter_score
0 1 -0.77 -0.55 0.21 3.42 q1 -0.77
1 2 -0.77 -0.55 0.21 3.42 q1 -0.77
2 3 -0.77 -0.55 0.21 3.42 q2 -0.55
3 4 -0.77 -0.55 0.21 3.42 q3 0.21
4 5 -0.77 -0.55 0.21 3.42 q4 3.42
CodePudding user response:
For huge datasets, going down to numpy's level can be interesting
df['quarter_score'] = df.to_numpy()[np.arange(len(df)),
df['quarter'].str[-1].astype(int) - 1]