Python: Set values of DataFrame as NaN based on valid index from another DataFrame

Time:10-21

I have two DataFrames:

df1:
ticker           A        B        C
date
2022-01-01       NaN      NaN      100
2022-01-02       NaN      200      NaN
2022-01-03       100      NaN      NaN
2022-01-04       NaN      NaN      120

df2:
ticker           A        B        C
date
2022-01-02       145      233      100
2022-01-03       231      200      241
2022-01-04       100      200      422
2022-01-05       424      324      222
2022-01-06       400      421      320

I want to set the values in df2 to np.nan at every index and column where the corresponding value in df1 is not null, to get the following:

df3:
ticker           A        B        C
date
2022-01-02       145      NaN      100
2022-01-03       NaN      200      241
2022-01-04       100      200      NaN
2022-01-05       424      324      222
2022-01-06       400      421      320

How can this be done Pythonically without going into many loops?

CodePudding user response:

Use this:

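import numpy as np
import pandas as pd

# The merge needs 'date' as a regular column, so move it out of the index
df = df1.reset_index()
df2 = df2.reset_index()
df2.columns = df2.columns + '2'
final = df.merge(df2, left_on='date', right_on='date2')
final['A2'] = np.where(final['A'].notnull(), np.nan, final['A2'])
final['B2'] = np.where(final['B'].notnull(), np.nan, final['B2'])
final['C2'] = np.where(final['C'].notnull(), np.nan, final['C2'])
final = final[df2.columns]
# Append the df2 rows that had no match in df1; matched rows keep the masked version
final = pd.concat([final, df2]).drop_duplicates(subset='date2', keep='first')
final.columns = df.columns
print(final)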
'''
    date        A       B       C
0   2022-01-02  145.0   NaN     100.0
1   2022-01-03  NaN     200.0   241.0
2   2022-01-04  100.0   200.0   NaN
3   2022-01-05  424.0   324.0   222.0
4   2022-01-06  400.0   421.0   320.0
'''

CodePudding user response:

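# If df2 holds integers, cast it first (df2 = df2.astype(float)) so it can store NaN
for col in df1:
    # Dates where df1 has a value, limited to dates that also exist in df2
    idx = df1.index[df1[col].notna()].intersection(df2.index)
    # .loc avoids chained assignment, which may write to a temporary copy
    df2.loc[idx, col] = np.nan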
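
A loop-free variant of the same idea is sketched below. It is not from either answer above. It assumes `date` is the index of both frames (as the printed layout suggests) and rebuilds the sample data so the snippet runs on its own. `reindex_like` aligns df1 to df2's rows and columns, and `mask` writes NaN wherever the aligned df1 holds a value.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; 'date' is assumed to be the index
dates1 = pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"])
df1 = pd.DataFrame(
    {"A": [np.nan, np.nan, 100, np.nan],
     "B": [np.nan, 200, np.nan, np.nan],
     "C": [100, np.nan, np.nan, 120]},
    index=pd.Index(dates1, name="date"),
)
df2 = pd.DataFrame(
    {"A": [145, 231, 100, 424, 400],
     "B": [233, 200, 200, 324, 421],
     "C": [100, 241, 422, 222, 320]},
    index=pd.date_range("2022-01-02", periods=5, name="date"),
)

# Align df1 to df2's labels; dates that exist only in df2 become NaN in the
# aligned frame, so notna() is False there and those cells are left alone
to_blank = df1.reindex_like(df2).notna()

# mask() replaces values with NaN wherever the condition is True
df3 = df2.mask(to_blank)
print(df3)
```

Dates that appear only in df1 (2022-01-01 here) are dropped by the alignment, so no error handling is needed, and df2 itself is left unmodified.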