Home > database >  Check values in two different columns and replace non-identical values
Check values in two different columns and replace non-identical values

Time:06-10

I have this df

A B
111 4
111 4
112 0
112 2
113 3
113 3
114 nan
114 1

I want to replace nan and 0 values with other values from col B for the corresponding item from col A as follows:

A B
111 4
111 4
112 2
112 2
113 3
113 3
114 1
114 1

I tried this but this not returning the correct values

df['B'].fillna(0)
df=df.merge(df[B > 0].groupby('$LINK:NO').size().reset_index(name='B'), on='A')

CodePudding user response:

Replace values less or equal 0 to missing values in Series.where, so possible get first non missing values per groups by GroupBy.transform with GroupBy.first:

df['B'] = (df.assign(new = df['B'].where(df['B'].gt(0)))
             .groupby('A')['new']
             .transform('first'))
print (df)
     A    B
0  111  4.0
1  111  4.0
2  112  2.0
3  112  2.0
4  113  3.0
5  113  3.0
6  114  1.0
7  114  1.0

Another idea is sorting use max:

df['B'] = df.sort_values('B').groupby('A').transform('max')
print (df)
     A    B
0  111  4.0
1  111  4.0
2  112  2.0
3  112  2.0
4  113  3.0
5  113  3.0
6  114  1.0
7  114  1.0
  • Related