I have the following dataframe:
Periods 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Item 1 89.0 89.0 88.0 86.0 73.0 73.0 67.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Item 2 197.0 197.0 157.0 157.0 156.0 156.0 153.0 153.0 3.0 3.0 2.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN
Item 3 105.0 105.0 98.0 47.0 40.0 21.0 20.0 15.0 13.0 11.0 11.0 11.0 10.0 7.0 7.0 5.0 5.0 3.0 2.0 1.0
It has several NaNs, and columns from a range between 0 - 20. What i wanted, is to replicate the first column to the next columns, until this value is not NaN. Let me show you an example of what it should look like:
Periods 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Item 1 89.0 89.0 89.0 89.0 89.0 89.0 89.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Item 2 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN
Item 3 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0
But I can't figure out a way to do this. I wish i could post some code, but all I tried was to search for those methods: backfill, ffill. None of them looked like what I needed.
Any ideas? Thanks in advance.
CodePudding user response:
I am assuming here that "Periods" is the index
You can use mask
on the columns with df.notna()
as mask, and the first column (df.iloc[:, 0]
) as replacement values:
df.mask(df.notna(), df.iloc[:, 0], axis=0)
output:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Periods
Item 1 89.0 89.0 89.0 89.0 89.0 89.0 89.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Item 2 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 197.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN
Item 3 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0 105.0
CodePudding user response:
Here's another idea, just for the record: using map
:
df.iloc[0] = df.iloc[0].notna().map({True: df.iloc[0, 0]})
CodePudding user response:
I will do mul
df.notna().mul(df.iloc[:,0],axis=0).mask(df.isna())