Currently, I have a dataframe that looks like this:
Heading 1 | Heading 2 |
---|---|
01 | 3G |
02 | 94 |
03 | 78 |
04 | 3L |
The Heading 2 values are wrong and are supposed to be 113G, 0994, 0978, and 113N.
My attempt looks like this:
import numpy as np

Correct_List = ['113G', '0994', '0978', '113N']
Incorrect_List = ['94', '3G', '3L', '78']

def correcting(df, list_1, list_2):
    for (two_digit, four_digit) in zip(list_1, list_2):
        if two_digit == four_digit[-2:]:
            df['DDIST'] = np.where((df.DDIST == two_digit), four_digit, df.DDIST)
    return df
I ran the function and checked the data frame, but nothing happened.
The desired output is:
Heading 1 | Heading 2 |
---|---|
01 | 113G |
02 | 0994 |
03 | 0978 |
04 | 113L |
Any method better than the loop would also be appreciated, since the actual data frame and the lists are very big.
CodePudding user response:
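You can build a dictionary that maps each incorrect value to its correct value, then call replace on the column and pass it the dictionary.
The two lists must be in matching order for zip to pair them correctly. With the lists exactly as posted they are not, which is why the output below differs from the desired one.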
>>> mapping = dict(zip(Incorrect_List, Correct_List))
>>> df['Heading 2'] = df['Heading 2'].replace(mapping)
>>> df
Heading 1 Heading 2
0 01 0994
1 02 113G
2 03 113N
3 04 0978
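Writing the pairs out explicitly avoids relying on the order of two separate lists. The following is only a minimal sketch: it assumes the column holds strings, and the `3L` to `113L` pair is taken from the desired output in the question (the question also mentions `113N`, so that pair is an assumption).

```python
import pandas as pd

# Sample frame rebuilt from the question; values are assumed to be strings.
df = pd.DataFrame({
    "Heading 1": ["01", "02", "03", "04"],
    "Heading 2": ["3G", "94", "78", "3L"],
})

# Each wrong value is paired with its correction in one place,
# so the pairing cannot drift out of order.
mapping = {"3G": "113G", "94": "0994", "78": "0978", "3L": "113L"}

# replace() swaps exact matches and leaves any other value untouched.
df["Heading 2"] = df["Heading 2"].replace(mapping)
print(df)
```

Because `replace` leaves unmatched values as they are, rows that are already correct are not affected.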
CodePudding user response:
In the provided sample, the last value should be 3N instead of 3L.
Create a dictionary from the correct values, using the last two characters of each as the key, then map the column through it.
d = {i[-2:]: i for i in Correct_List}
df['Heading 2'] = df['Heading 2'].map(d)
df
Heading 1 Heading 2
0 1 113G
1 2 0994
2 3 0978
3 4 113N
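One thing to watch with `map` is that any value without a matching key becomes NaN. A minimal sketch of one way to handle that, assuming string values and using the sample data exactly as posted (so `3L` has no match); the `unmatched` name is only illustrative:

```python
import pandas as pd

# Sample frame rebuilt from the question, with 3L left as posted.
df = pd.DataFrame({
    "Heading 1": ["01", "02", "03", "04"],
    "Heading 2": ["3G", "94", "78", "3L"],
})
Correct_List = ["113G", "0994", "0978", "113N"]

# Key each corrected code by its last two characters.
d = {code[-2:]: code for code in Correct_List}

# map() yields NaN where no key matches.
mapped = df["Heading 2"].map(d)

# Collect the values that found no correction, for inspection.
unmatched = df.loc[mapped.isna(), "Heading 2"].tolist()

# Fall back to the original value wherever the lookup failed.
df["Heading 2"] = mapped.fillna(df["Heading 2"])
print(df)
print(unmatched)
```

Checking `unmatched` before overwriting the column is a cheap way to spot typos in either the data or the list of correct codes.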
CodePudding user response:
Do you have all the correct values in a list? If the list holds one correct value per row, in row order, you can just reassign the whole DF column like this:
df['Heading 2'] = Correct_List
But if you need some custom logic, rather than looping through the data frame, convert the column into a list, apply your logic and then reassign the values in the DF.
x = df['Heading 2'].tolist()
# perform whatever logic you need
df['Heading 2'] = x
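As a sketch of what the "custom logic" step might look like: the rule below (prefix `11` when the value ends in a letter, otherwise `09`) is only a guess inferred from the four sample rows, and `fix_code` is a hypothetical helper, so substitute whatever rule actually applies to the real data.

```python
import pandas as pd

# Sample frame rebuilt from the question; values are assumed to be strings.
df = pd.DataFrame({
    "Heading 1": ["01", "02", "03", "04"],
    "Heading 2": ["3G", "94", "78", "3L"],
})

def fix_code(value):
    # Hypothetical rule guessed from the sample rows only:
    # codes ending in a letter get "11", all-digit codes get "09".
    prefix = "11" if value[-1].isalpha() else "09"
    return prefix + value

# Pull the column out as a plain list, apply the logic, then assign it back.
x = df["Heading 2"].tolist()
x = [fix_code(v) for v in x]
df["Heading 2"] = x
print(df)
```

The list comprehension keeps the row order, so assigning `x` back lines up with the original rows.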