How to edit dataframe to correct values, preferably without a loop?

Time:09-20

Currently, I have a dataframe that looks like this:

Heading 1 Heading 2
01 3G
02 94
03 78
04 3L

The Heading 2 values are wrong and are supposed to be 113G, 0994, 0978, and 113N.

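My attempt at this looks like:

import numpy as np

Correct_List = ['113G', '0994', '0978', '113N']
Incorrect_List = ['94', '3G', '3L', '78']

def correcting(df, list_1, list_2):
    # Pair each two-character value with a four-character value by position
    for two_digit, four_digit in zip(list_1, list_2):
        # Replace only when the two-character value is the suffix of the four-character one
        if two_digit == four_digit[-2:]:
            df['DDIST'] = np.where(df.DDIST == two_digit, four_digit, df.DDIST)
    return df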

I ran the function and checked the data frame, but nothing happened.

The desired output is:

Heading 1 Heading 2
01 113G
02 0994
03 0978
04 113L

Any method better than the loop would also be appreciated, since the actual data frame and loop are very big.

CodePudding user response:

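You can create a dictionary of key-value pairs from your incorrect and correct lists, then call replace on the column, passing the dictionary. Note that zip pairs the lists by position, so both lists must be in the same order; with the lists exactly as posted they are not, which is why the output below shows mismatched values (for example, 3G becomes 0994).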

>>> mapping = {k: v for k, v in zip(Incorrect_List, Correct_List)}
>>> df['Heading 2'] = df['Heading 2'].replace(mapping)
>>> df
  Heading 1 Heading 2
0        01      0994
1        02      113G
2        03      113N
3        04      0978
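
To get the desired output with this approach, the two lists need to line up. Here is a minimal, self-contained sketch, assuming the values are strings, the lists are rewritten in matching order, and 3L is meant to become 113N (the sample frame is rebuilt from the question):

```python
import pandas as pd

# Sample frame from the question; all values are kept as strings.
df = pd.DataFrame({
    "Heading 1": ["01", "02", "03", "04"],
    "Heading 2": ["3G", "94", "78", "3L"],
})

# Assumption: both lists are in the same order, so zip pairs each
# incorrect value with its own correction ("3L" -> "113N" is assumed).
incorrect_list = ["3G", "94", "78", "3L"]
correct_list = ["113G", "0994", "0978", "113N"]

mapping = dict(zip(incorrect_list, correct_list))

# replace() swaps exact matches and leaves every other value untouched.
df["Heading 2"] = df["Heading 2"].replace(mapping)
print(df)
```

Because replace leaves unmatched values alone, rows that are already correct are not affected.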

CodePudding user response:

In the provided sample, the last value should be 3N instead of 3L.

Create a dictionary from the correct values, using the last two characters of each as the key, then map it onto the column:

d = {i[-2:]: i for i in Correct_List}
df['Heading 2'] = df['Heading 2'].map(d)
df
    Heading 1   Heading 2
0           1     113G
1           2     0994
2           3     0978
3           4     113N
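
One thing to watch with map: any value that has no key in the dictionary comes back as NaN. A small sketch of one way to keep such values, assuming string data and a made-up extra row ("05", "ZZ") that has no correction:

```python
import pandas as pd

# Sample frame with the 3N fix applied; the last row is hypothetical
# and is there only to show what happens to an unmatched value.
df = pd.DataFrame({
    "Heading 1": ["01", "02", "03", "04", "05"],
    "Heading 2": ["3G", "94", "78", "3N", "ZZ"],
})
correct_list = ["113G", "0994", "0978", "113N"]

# Key each correct value by its last two characters.
lookup = {code[-2:]: code for code in correct_list}

# map() gives NaN where no key matches; fillna() puts the original value back.
df["Heading 2"] = df["Heading 2"].map(lookup).fillna(df["Heading 2"])
print(df)
```

This stays vectorized, so it should scale better than looping over the pairs.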

CodePudding user response:

Do you have all the correct values in a list? If yes, you could just reassign all the values in the DF column like this:

df['Heading 2'] = Correct_List

But if you need some custom logic, rather than looping through the data frame, convert the column into a list, apply your logic and then reassign the values in the DF.

x = df['Heading 2'].tolist()
# perform whatever logic you need
df['Heading 2'] = x  
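
As a rough illustration of the list round trip, assuming string values and a hypothetical helper fix_code that stands in for your custom logic:

```python
import pandas as pd

df = pd.DataFrame({
    "Heading 1": ["01", "02", "03", "04"],
    "Heading 2": ["3G", "94", "78", "3N"],
})
correct_list = ["113G", "0994", "0978", "113N"]
lookup = {code[-2:]: code for code in correct_list}

def fix_code(value):
    # Hypothetical rule: return the four-character code that ends with
    # this value, or the value unchanged if there is none.
    return lookup.get(value, value)

x = df["Heading 2"].tolist()
x = [fix_code(v) for v in x]  # apply the logic on a plain Python list
df["Heading 2"] = x           # the list must have one item per row
print(df)
```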