I am currently trying to fill blanks in a data frame that looks like the following:
AL|ATFC|Year Latitude Longitude
0 AL011851 NaN NaN
1 NaN 28.0N 94.8W
2 NaN 28.0N 95.4W
3 NaN 28.0N 96.0W
4 NaN 28.1N 96.5W
5 NaN 28.2N 96.8W
6 NaN 28.2N 97.0W
7 NaN 28.3N 97.6W
8 NaN 28.4N 98.3W
9 NaN 28.6N 98.9W
10 NaN 29.0N 99.4W
11 NaN 29.5N 99.8W
12 NaN 30.0N 100.0W
13 NaN 30.5N 100.1W
14 NaN 31.0N 100.2W
15 AL021851 NaN NaN
16 NaN 22.2N 97.6W
17 AL031851 NaN NaN
18 NaN 12.0N 60.0W
I have been trying the following line of code, with the goal of filling the NaN values in the AL|ATFC|Year column using the pandas ffill() function:
df.where(df['AL|ATFC|Year'] == float('NaN'), df['AL|ATFC|Year'].ffill(), axis=1, inplace=True)
To get the following dataframe:
AL|ATFC|Year Latitude Longitude
0 AL011851 NaN NaN
1 AL011851 28.0N 94.8W
2 AL011851 28.0N 95.4W
3 AL011851 28.0N 96.0W
4 AL011851 28.1N 96.5W
5 AL011851 28.2N 96.8W
6 AL011851 28.2N 97.0W
7 AL011851 28.3N 97.6W
8 AL011851 28.4N 98.3W
9 AL011851 28.6N 98.9W
10 AL011851 29.0N 99.4W
11 AL011851 29.5N 99.8W
12 AL011851 30.0N 100.0W
13 AL011851 30.5N 100.1W
14 AL011851 31.0N 100.2W
15 AL021851 NaN NaN
16 AL021851 22.2N 97.6W
17 AL031851 NaN NaN
18 AL031851 12.0N 60.0W
Thereafter, I am planning to drop the rows with missing Lat/Lon values. However, the code I have been trying does not fill in the missing values in the AL|ATFC|Year column, and I don't understand why. Any help would be much appreciated!
Thanks
CodePudding user response:
You could replace the 'NaN' strings in the 'AL|ATFC|Year' column with np.nan, and then use the fillna function. I reproduced only the first 3 rows:
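import numpy as np
import pandas as pd
data = {'AL|ATFC|Year': ['AL011851', 'NaN', 'NaN'],
        'Latitude': ['NaN', '28.0N', '28.0N'],
        'Longitude': ['NaN', '94.8W', '95.4W']}
df = pd.DataFrame(data)
# Turn the literal 'NaN' strings into real missing values
df['AL|ATFC|Year'] = df['AL|ATFC|Year'].replace('NaN', np.nan)
# Forward-fill: copy the last valid ID into the missing rows below it
df['AL|ATFC|Year'] = df['AL|ATFC|Year'].ffill()
print(df)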
outputs:
AL|ATFC|Year Latitude Longitude
0 AL011851 NaN NaN
1 AL011851 28.0N 94.8W
2 AL011851 28.0N 95.4W
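For the follow-up step mentioned in the question (dropping the rows without coordinates), something along these lines should work. This is only a sketch: it assumes the missing entries are real NaN values rather than 'NaN' strings, and the five sample rows and the name `cleaned` are made up for illustration.

```python
import numpy as np
import pandas as pd

# A few rows shaped like the question's data, with real missing values
df = pd.DataFrame({
    "AL|ATFC|Year": ["AL011851", np.nan, np.nan, "AL021851", np.nan],
    "Latitude": [np.nan, "28.0N", "28.0N", np.nan, "22.2N"],
    "Longitude": [np.nan, "94.8W", "95.4W", np.nan, "97.6W"],
})

# Carry each storm ID down to the rows below it
df["AL|ATFC|Year"] = df["AL|ATFC|Year"].ffill()

# Drop the header rows, which have no Latitude/Longitude
cleaned = df.dropna(subset=["Latitude", "Longitude"]).reset_index(drop=True)
print(cleaned)
```

The fill has to happen before the drop, otherwise the rows that carry the IDs are removed before their values can be propagated.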
CodePudding user response:
ffill forward-fills a value only where the entry is NA/NaN, so you do not need a separate NaN condition around it:
df['AL|ATFC|Year'] = df['AL|ATFC|Year'].ffill()
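As for why the original condition never matched: NaN does not compare equal to anything, including itself, so `== float('NaN')` is False on every row. The sketch below illustrates this on a small made-up Series (the names `s`, `eq_mask`, `na_mask` and `filled` are just for the example); `isna()` is the usual way to test for missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample of the ID column, stored as plain Python objects
s = pd.Series(["AL011851", np.nan, np.nan, "AL021851", np.nan],
              dtype=object, name="AL|ATFC|Year")

# NaN is never equal to NaN, so an equality test cannot find it
nan_equal = float("nan") == float("nan")

eq_mask = s == float("nan")   # False on every row
na_mask = s.isna()            # True exactly on the missing rows

# ffill already acts only on the missing rows, so no mask is needed
filled = s.ffill()

print(nan_equal)
print(eq_mask.any(), na_mask.sum())
print(filled.tolist())
```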