Id like to ask for help in fixing the missing values in pandas dataframe (python)
In this dataset I found a missing value in ['Item_Weight'] column.
I don't want to drop the missing values because I found out by sorting them. the missing value is "miss type" by someone who encoded it.
Now I created a lookup dataset so I can merge them to fill na missing values.
How can I merge them or join them only to fill the missing values (Nan) using the lookup table I made? Or is there any other way without using a lookup table?
CodePudding user response:
Looking at this you will probably want to use something along the lines of map
instead of join/merge
this is an example of how to use map with your data.
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Column1' : ['A', 'B', 'C'],
'Column2' : [1, np.nan, 3]
})
df
df_map = pd.DataFrame({
'Column1' : ['A', 'B', 'C'],
'Column2' : [1, 2, 3]
})
df_map
#Looks to find where the column you specify is null, then using your map df will map the value from column1 to column2
df['Column2'] = np.where(df['Column2'].isna(), df['Column1'].map(df_map.set_index('Column1')['Column2']), df['Column2'])
I had to create my own dataframes since you used screenshots. In the future, the use of screenshots is not considered best to help developers with assistance.
CodePudding user response:
This will probably work:
df = df.sort_values(['Item_Identifier', 'Item_Weight']).ffill()
But I can't test it since you didn't give us anything to work with.