I have the below table:
df = pd.DataFrame({'Student ID': [123456,789456,101112,131415],
'Math ': [3,2,4,2],
'Art ': [3,3,3,1]})
I want to look into the combination of the exams and label each student as follows:
incase the student got:
- 1 in Math and 2 in Art (or vise versa) then labelled "Usatisfactory"
- 3 in Math and 3 in Art (or vise versa) then labelled "Good"
- 3 in Math and 4 in Art (or vise versa) then labelled "Good"
- 2 in Math and 3 in Art (or vise versa) then labelled "Moderate"
Student ID | Math | Art | combined |
---|---|---|---|
123456 | 3 | 3 | 33 |
789456 | 2 | 3 | 23 |
101112 | 4 | 3 | 43 |
131415 | 2 | 1 | 21 |
What I did, is created the combined column using the following code:
Create a new column and concatenate data
cols = ['Math', 'Art',]
df['combined'] = df[cols].apply(lambda x: ''.join(x.values.astype(str)), axis=1)
convert 'combined' column to integer
df['combined'] = df['combined'] .astype('int')
Create a combination for score
unsat =[11]
moderate =[12,21,22]
good =[33,34,43]
excellent=[45,54,55]
so if the score is 11 or (1 for math and 1 for Art) he/she will get unsatisfactory if the student got any combination of 33 or 34 then moderate and so on.
I am unable to know how to use my logic to label my data. I feel that my way is way too complex
CodePudding user response:
Since you allow vice verse, math 1 art 2 is equal to math 2 art 1, which means you can simply use sum
:
mapper = {3: "Unsatisfactory",
6: "Good",
7: "Good",
5: "Moderate"}
df["label"] = df["Math"].add(df["Art"]).replace(mapper)
print(df)
Output:
Student ID Math Art label
0 123456 3 3 Good
1 789456 2 3 Moderate
2 101112 4 3 Good
3 131415 2 1 Unsatisfactory