How do I fill information of a blank row from another row where information is available in Python?-CodePudding

Let's say I have a data frame

A	B	C
A1	R3	P5
A2	R2	P4
A2	NaN	P4
A1	NaN	NaN

I want to fill in information for the 3rd and 4th rows from the 2nd and 1st rows since column A values match those two rows. The expected data frame is as follows:

A	B	C
A1	R3	P5
A2	R2	P4
A2	R2	P4
A1	R3	P5

Can you please help me with it?

CodePudding user response：

Assuming a single non-NA value per A

You can use a self merge:

out = df[['A']].merge(df.dropna(), how='left')

Or, if there is a chance of duplicated rows:

out = df[['A']].merge(df.dropna().drop_duplicates(), how='left')

output:

    A   B   C
0  A1  R3  P5
1  A2  R2  P4
2  A2  R2  P4
3  A1  R3  P5

CodePudding user response：

here is one way to do it.

sorting by column 'A', forward filling the null values and then re-sorting by index

df.sort_values(['A'] ).ffill().sort_index()

     A   B   C
0   A1  R3  P5
1   A2  R2  P4
2   A2  R2  P4
3   A1  R3  P5