Home > Mobile >  How to add function to multi columns?
How to add function to multi columns?

Time:11-08

I have the df:

lst_1 = ['November-2022', 'October-2022', 'September-2022', 'November-2022']
lst_2 = ['', '', '', '', '']
lst_3 = ['', '', '', '', '']
lst_4 = ['', '', '', '', '']

df1 = pd.DataFrame(list(zip(lst_1 , lst_2, lst_3, lst_4)),
              columns =['Date_updated', 'November-2022', 'October-2022', 'September-2022'])

I want to write a function to have df2 like this:

lst_1 = ['November-2022', 'October-2022', 'September-2022', 'November-2022']
lst_2 = ['x', '', '', 'x', '']
lst_3 = ['', 'x', '', '', '']
lst_4 = ['', '', 'x', '', '']

df2 = pd.DataFrame(list(zip(lst_1 , lst_2, lst_3, lst_4)),
              columns =['Date_updated', 'November-2022', 'October-2022', 'September-2022'])

When row['Date_updated'] == column names of df1, it will tick 'x' to that columns with the same name. I tried this, but not work:

df1['Date_Updated'] = df1['Date_Updated'].astype(str)
lst_date = list(df1['Date_Updated'].tolist())
def add_tick(r):
    for i in lst_date:
        if r['Date_Updated'] == r[i]:
            return 'X'
        continue
for i in lst_date:
    df1[i] = df1.apply(add_tick, axis = 1)

CodePudding user response:

Use pd.crosstab:

new_df = df.set_index('Date_updated')
output = pd.crosstab(index=new_df.index, columns=new_df.columns, values='x', 
                     aggfunc=lambda x: x).fillna('')

Output:

col_0   November-2022   October-2022    September-2022
row_0           
November-2022   x                            
October-2022                 x               
September-2022                               x

CodePudding user response:

Use get_dummies with DataFrame.replace and DataFrame.update:

df1.update(pd.get_dummies(df1['Date_updated']).replace({1:'X', 0:''}))
print (df1)
     Date_updated November-2022 October-2022 September-2022
0   November-2022             X                            
1    October-2022                          X               
2  September-2022                                         X
3   November-2022             X       

Details:

print (pd.get_dummies(df1['Date_updated']))
   November-2022  October-2022  September-2022
0              1             0               0
1              0             1               0
2              0             0               1
3              1             0               0

Another idea without update datetimes columns:

df2 = df1[['Date_updated']].join(pd.get_dummies(df1['Date_updated']).replace({1:'X', 0:''}))
print (df2)
     Date_updated November-2022 October-2022 September-2022
0   November-2022             X                            
1    October-2022                          X               
2  September-2022                                         X
3   November-2022             X                      
  • Related