I have data with one column that indicates a color and another that indicates a letter. If the color and the letter 'belong' together, the response is correct, so a new column should contain C. Otherwise, it should contain I.
I did it like this, but this only puts all the correct rows at the top and the incorrect rows at the bottom:
#correct
c1 = df['color'].eq('green') & df['value'].eq('V')
c2 = df['color'].eq('blue') & df['value'].eq('A')
c3 = df['color'].eq('red') & df['value'].eq('R')
m = c1 | c2 | c3
correct_df = df.loc[m, ['Person ID','word', 'rt', 'color']]
correct_df['accuracy'] = 'C'
incorrect_df = df.loc[~m, ['word', 'rt', 'color']]
incorrect_df['accuracy'] = 'I'
# DataFrame.append was removed in pandas 2.0; pd.concat does the same stacking
df_cor_inc = pd.concat([correct_df, incorrect_df])
What I need instead is for the new column to be added alongside the existing ones, saying whether the response was correct or not, with the rows kept in the order the data is already in.
This is a sample of the data:
Person ID  value  word    color  correct  rt
0          R      FLOWER  red    r        1223
0          B      CAR     blue   b        33
1          G      KNIFE   blue   b        333
1          R      CAT     red    r        2332
2          B      CHILD   green  g        232
This is how I want it to look:
Person ID  value  word    color  correct  rt    accuracy
0          R      FLOWER  red    r        1223  C
0          B      CAR     blue   b        33    C
1          G      KNIFE   blue   b        333   I
1          R      CAT     red    r        2332  C
2          B      CHILD   green  g        232   I
Answer:
Reusing your boolean mask m, we can use np.where() as follows:
import numpy as np

df['accuracy'] = np.where(m, 'C', 'I')
np.where() acts like an element-wise if-then-else statement. Where the condition in the first parameter is True, it sets the value from the second parameter ('C' here); otherwise, it sets the value from the third parameter ('I' here). Because the result is assigned as a new column, the rows stay in their original order.
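As a small illustration of that if-then-else behaviour, here is a minimal sketch on a made-up boolean array. The name `mask` and its values are hypothetical and are not taken from the question's data.

```python
import numpy as np

# Hypothetical boolean mask, only to show the element-wise selection
mask = np.array([True, False, True])

# Where mask is True take 'C', otherwise take 'I'
labels = np.where(mask, 'C', 'I')
print(labels.tolist())  # ['C', 'I', 'C']
```

The output has the same length and order as the mask, which is why it can be assigned directly as a DataFrame column.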
Result:
print(df)
   Person ID value    word  color correct    rt accuracy
0          0     R  FLOWER    red       r  1223        C
1          0     B     CAR   blue       b    33        I
2          1     G   KNIFE   blue       b   333        I
3          1     R     CAT    red       r  2332        C
4          2     B   CHILD  green       g   232        I
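Note that row 1 (B, blue) comes out as I here, because the mask m from the question pairs blue with 'A' and green with 'V', while the sample data uses B and G. If the letters in the sample are the intended codes, something like the sketch below might be closer to the desired output. It rebuilds the sample rows by hand and uses a hypothetical dict named `expected` for the color-to-letter pairing; that pairing is an assumption inferred from the sample, not something the question states.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'Person ID': [0, 0, 1, 1, 2],
    'value': ['R', 'B', 'G', 'R', 'B'],
    'word': ['FLOWER', 'CAR', 'KNIFE', 'CAT', 'CHILD'],
    'color': ['red', 'blue', 'blue', 'red', 'green'],
    'correct': ['r', 'b', 'b', 'r', 'g'],
    'rt': [1223, 33, 333, 2332, 232],
})

# Assumed pairing of color -> letter, inferred from the sample data
expected = {'red': 'R', 'blue': 'B', 'green': 'G'}

# True where the row's letter matches the letter expected for its color
m = df['color'].map(expected).eq(df['value'])

# Add the column in place, so the original row order is kept
df['accuracy'] = np.where(m, 'C', 'I')
print(df)
```

Using a dict with map() instead of one condition per color keeps the pairing in one place, so adding another color only means adding another entry.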