I am trying based on the code seen below to enhance it in order to do what it does, meaning moving each value from the same row onto the next cell if theres a NaN. The adjustment that I am trying to make is if, Jan (1st month of each row) is NaN then fill it in with the last value from the previous year which would be Jun - 2.04
This is what I am using so far:
df.loc[df['Jan'].isna(), 'Jan'] = df[df['Jan'].isna()].apply(lambda x: x[x.notna()][-1], axis=1)
df.loc[:, 'Jan':] = df.loc[:, 'Jan':].ffill(axis=1)
print(df)
Input sample data
Region-INF Series Name Series ID Jan Feb Mar Apr May Jun
Pacific All Items CUUR0490SDD 2.9 2.8 NaN NaN 2.52 **2.04**
Pacific All Items CUE07890SDF NaN 2.64 NaN 2.44 2.59 3
Pacific All Items CUE073310SAF 2.1 2.4 NaN 2.21 3.45 NaN
Expected output:
Region-INF Series Name Series ID Jan Feb Mar Apr May Jun
Pacific All Items CUUR0490SDD 2.9 2.8 2.8 2.8 2.52 **2.04**
Pacific All Items CUE07890SDF **2.04** 2.64 2.64 2.44 2.59 3
Pacific All Items CUE073310SAF 2.1 2.4 2.4 2.21 3.45 3.45
Any suggestions how I can modify the existing code?
CodePudding user response:
You can modify the existing code by first adding a new column that contains the last value from the previous year. Then use that column to fill in the missing values in the 'Jan' column.
df.loc[:, "Jan":] = df.loc[:, "Jan":].ffill(axis=1)
# Shift the values in the last column by one row
df["last_col_shifted"] = df.iloc[:, -1].shift(1)
# Use the shifted column to fill in missing values in the 'Jan' column
df.loc[df["Jan"].isna(), "Jan"] = df["last_col_shifted"]
# remove the shifted column
df.drop(["last_col_shifted"], axis=1, inplace=True)
print(df)