I have a data frame as follows:
col1 | col2 | col3 |
---|---|---|
A | E | C |
A | D | |
D | B | |
A | D | E |
A | | C |
And a list answer_key = ["A", "B", "C"].
I want to compare the values in each column against the corresponding entry of the list, in order (col1 against "A", col2 against "B", col3 against "C").
The score should follow this rule: no response = 0, correct answer = 5, incorrect answer = -5. Please also return the total score.
CodePudding user response:
Try:
import numpy as np
answer_key = np.array(["A", "B", "C"])
# 5 points for every cell in the row that matches the key
df['score'] = df.apply(lambda x: ((x.to_numpy() == answer_key).sum())*5, axis=1)
OUTPUT:
  col1 col2 col3  score
0    A    E    C     10
1    A  NaN    D      5
2    D    B  NaN      5
3    A    D    E      5
4    A  NaN    C     10
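Note that the snippet above awards 5 points per match but never subtracts points for a wrong answer. If the full rule from the question is wanted (blank = 0, correct = 5, incorrect = -5), one possible sketch is shown below. It assumes blanks are stored as NaN and reuses the rows from the OUTPUT table; the names `cols`, `points` and `total_score` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the OUTPUT table above; NaN marks "no response".
df = pd.DataFrame({
    "col1": ["A", "A", "D", "A", "A"],
    "col2": ["E", np.nan, "B", "D", np.nan],
    "col3": ["C", "D", np.nan, "E", "C"],
})
answer_key = ["A", "B", "C"]
cols = ["col1", "col2", "col3"]  # hypothetical helper: the answer columns, in key order

answers = df[cols]
# True where a cell equals the key entry for its column
correct = answers.eq(answer_key, axis="columns")
# True where the cell holds a response at all
answered = answers.notna()

# 0 for a blank, +5 for a match, -5 for any other response
points = np.where(~answered, 0, np.where(correct, 5, -5))

df["score"] = points.sum(axis=1)          # per-row score
total_score = int(df["score"].sum())      # score over the whole frame

print(df)
print("total:", total_score)
```

Keeping `answered` separate from `correct` matters here, because a NaN cell compares as "not equal" to the key and would otherwise be counted as a wrong answer.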
CodePudding user response:
This sounds like a homework question, so I will only provide you with pseudocode to help point you in the correct direction. I am also assuming that you want to compare the contents of each column to your answer_key and that no columns will be added dynamically.
# Create a list with your keys (you already did this)
# Create three separate lists, one for each column (col1, col2, col3)
# Also use a default value for entries that are empty
# Ex1: col2 = ['E', None, 'B']
# Ex2: col2 = ['E', '0', 'B'] - either of these methods could work
# Create a dictionary to reference these lists
cols = {0: col1, 1: col2, 2: col3}
# Create a variable to store the total score
score = 0
# Use nested loops to iterate through each column and each value
# example
for i in range(3):
    # temporarily cache a list object for reference
    curList = cols.get(i)
    # Compare each value in the column with that column's key
    for c in range(len(curList)):
        # If curList[c] == None (or whatever value you
        # are using for null) then leave score unchanged
        # Else if answer_key[i] == curList[c] then score += 5
        # Else (answer_key[i] != curList[c]) score -= 5
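For reference, the loop outline above could be filled in roughly as follows. This is only a sketch under a few assumptions: blanks are represented by `None`, each column is compared against the key entry at the same position, and the column values are taken from the OUTPUT table in the first answer. The helper `score_value` is a made-up name, not part of any library.

```python
answer_key = ["A", "B", "C"]

# One list per column; None stands for "no response".
# Values copied from the OUTPUT table in the first answer.
col1 = ["A", "A", "D", "A", "A"]
col2 = ["E", None, "B", "D", None]
col3 = ["C", "D", None, "E", "C"]

# Map each column position to its list of responses
cols = {0: col1, 1: col2, 2: col3}


def score_value(value, key):
    """Return 0 for a blank, 5 for a match, -5 for anything else."""
    if value is None:
        return 0
    return 5 if value == key else -5


total_score = 0
for i, key in enumerate(answer_key):
    # Every value in column i is judged against the i-th key entry
    for value in cols[i]:
        total_score += score_value(value, key)

print(total_score)
```

Checking for `None` before comparing keeps a blank from being treated as a wrong answer.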