Home > Back-end >  How can I change the values of columns based on the values from other columns?
How can I change the values of columns based on the values from other columns?

Time:04-22

Here are the tables before cleaned:

name date time_lag1 time_lag2 time_lag3 lags
a 2000/5/3 1 0 1 time_lag1
a 2000/5/10 1 1 0 time_lag2
a 2000/5/17 1 1 1 time_lag3
b 2000/5/3 0 1 0 time_lag1
c 2000/5/3 0 0 0 time_lag1

Logics are simple, each name have several date and that date correspond to a "lags". What I tried to do is to match the column names like "time_lag1","time_lag2",...,"time_lagn" to the values in column "lags". For example, the first value of "time_lag1" is because column name "time_lag1" equals the corresponding value of "lags" which is also "time_lag1". However, I don't know why the values of other columns and rows are becoming incorrect.

My thought is:

# time_lag columns are not following a trend, so it can be lag_time4 as well.
time_list = ['time_lag1','time_lag2','lag_time4'...]
for col in time_list:
   if col == df['lags'].values:
      df.col == 1
   else:
      df.col == 0

I don't know why the codes I tried is not working very well.

Here are the tables I tried to get:

name date time_lag1 time_lag2 time_lag3 lags
a 2000/5/3 1 0 0 time_lag1
a 2000/5/10 0 1 0 time_lag2
a 2000/5/17 0 0 1 time_lag3
b 2000/5/3 1 0 0 time_lag1
c 2000/5/3 1 0 0 time_lag1

CodePudding user response:

The simplest is to recalculate them from scratch with pandas.get_dummies and to update the dataframe:

df.update(pd.get_dummies(df['lags']))

Output:

  name       date  time_lag1  time_lag2  time_lag3       lags
0    a   2000/5/3          1          0          0  time_lag1
1    a  2000/5/10          0          1          0  time_lag2
2    a  2000/5/17          0          0          1  time_lag3
3    b   2000/5/3          1          0          0  time_lag1
4    c   2000/5/3          1          0          0  time_lag1
  • Related