How to compare the content of the data columns with the list in pandas

Time:10-22

I have a data frame as follows:

col1  col2  col3
A     E     C
A           D
D     B
A     D     E
A           C

I also have a list, answer_key = ["A", "B", "C"].

I want to compare the values in each column with the corresponding value in the list, in order (col1 with "A", col2 with "B", col3 with "C").

Each row should be scored by the following rule: no response = 0, correct answer = 5, incorrect answer = -5. I also need the total score.

CodePudding user response:

Try:

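import numpy as np
answer_key = np.array(["A", "B", "C"])
# 5 points for each value that matches the key (df is the data frame from the question)
df['score'] = df.apply(lambda x: ((x.to_numpy() == answer_key).sum())*5, axis=1)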

OUTPUT:

  col1 col2 col3  score
0    A    E    C     10
1    A  NaN    D      5
2    D    B  NaN      5
3    A    D    E      5
4    A  NaN    C     10
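
Note that the snippet above awards 5 points per match but does not subtract 5 for a wrong answer, and it does not return the total. Here is a minimal sketch of the full rule from the question. It assumes the blank cells are read as NaN and rebuilds the data frame from the question's table; the names cols, key, values and points are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank responses are assumed to be NaN
df = pd.DataFrame({
    "col1": ["A", "A", "D", "A", "A"],
    "col2": ["E", np.nan, "B", "D", np.nan],
    "col3": ["C", "D", np.nan, "E", "C"],
})
cols = ["col1", "col2", "col3"]

# Index the key by column name so each column is compared with its own answer
key = pd.Series(["A", "B", "C"], index=cols)

values = df[cols]
blank = values.isna().to_numpy()                          # no response
correct = values.eq(key, axis="columns").to_numpy()       # matches the key

# 0 for no response, 5 for a correct answer, -5 for an incorrect one
points = np.where(blank, 0, np.where(correct, 5, -5))

df["score"] = points.sum(axis=1)   # score per row
total = int(df["score"].sum())     # total score

print(df)
print("Total:", total)
```

With the sample data this gives row scores of 5, 0, 0, -5 and 10, for a total of 10.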

CodePudding user response:

This sounds like a homework question, so I will only provide pseudocode to point you in the right direction. I am also assuming that you want to compare the contents of each column with your answer_key and that the columns won't be added to dynamically.

# Create a list with your keys (you already did this)
answer_key = ["A", "B", "C"]

# Create three separate lists, one for each column (col1, col2, col3)
# Also use a default value such as None for responses that are empty
col1 = ["A", "A", "D", "A", "A"]
col2 = ["E", None, "B", "D", None]
col3 = ["C", "D", None, "E", "C"]

# Create a dictionary to reference these lists
cols = {0: col1, 1: col2, 2: col3}

# Create a variable to store the total score
score = 0

# Use nested loops to iterate through each column and each value
for i in range(len(answer_key)):

    # Temporarily cache the list for column i for reference
    curList = cols.get(i)

    # Compare each response in this column with the key for the column
    for c in range(len(curList)):

        # If curList[c] is None (or whatever value you
        # are using for null), the score does not change
        if curList[c] is None:
            continue

        # A correct answer adds 5, an incorrect answer subtracts 5
        if answer_key[i] == curList[c]:
            score += 5
        else:
            score -= 5

print(score)