Change values of a column based on the previous row using assign and a variable passed as a parameter

Time:05-30

Consider the following pandas DataFrame:

   code      visit_time   flag other  counter
0     0             NaT   True     X        3
1     0 1 days 03:00:12  False     Y        1
2     0             NaT  False     X        3
3     0 0 days 05:00:00   True     X        2
4     1             NaT  False     Z        3
5     1             NaT   True     X        3
6     1 1 days 03:00:12  False     Y        1
7     2             NaT   True     X        3
8     2 5 days 10:01:12   True     Y        0

Only the columns code, visit_time and flag are needed to solve the problem.

Every row with a non-null visit_time is preceded by a row whose visit_time is NaT. Knowing this, I want to make the following modification to the DataFrame:

  • Set the flag of each row with a non-null visit_time to the flag value of the previous row.

Code used (from @Cameron Riddell):

out = df.assign(
    flag=df['flag'].mask(df['visit_time'].notnull(), df['flag'].shift())
)

print(out)
   code      visit_time   flag other  counter
0     0             NaT   True     X        3
1     0 1 days 03:00:12   True     Y        1
2     0             NaT  False     X        3
3     0 0 days 05:00:00  False     X        2
4     1             NaT  False     Z        3
5     1             NaT   True     X        3
6     1 1 days 03:00:12   True     Y        1
7     2             NaT   True     X        3
8     2 5 days 10:01:12   True     Y        0
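To see why this works, the two pieces (shift and mask) can be inspected separately. The following is only a minimal sketch on a made-up four-row frame that mirrors the first rows above; the variable names has_visit, previous and result are illustrative and not part of the original code.

```python
import pandas as pd

# Made-up frame mirroring the first four rows of the data above.
df = pd.DataFrame({
    "visit_time": pd.to_timedelta(
        [pd.NaT, "1 days 03:00:12", pd.NaT, "0 days 05:00:00"]
    ),
    "flag": [True, False, False, True],
})

# True on the rows whose flag should be overwritten.
has_visit = df["visit_time"].notnull()

# shift() moves every value down one row, so row i holds the flag of row i - 1.
# The first row has no predecessor and becomes missing.
previous = df["flag"].shift()

# mask() keeps the original value where the condition is False and takes
# the value from `previous` where it is True.
result = df["flag"].mask(has_visit, previous)
print(result.tolist())
```

Only rows 1 and 3 have a non-null visit_time here, so only those two flags are replaced by the value of the row above them.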

The problem is that I want to reuse this code in a function, so the name of the flag column will be stored in a variable, say name. If I use the following code, the existing column is not modified; instead, a new column literally called name is added to the DataFrame.

name = 'flag'
out = df.assign(
    name=df[name].mask(df['visit_time'].notnull(), df[name].shift())
)

How can I get the same behaviour while modifying the column whose name is passed as a parameter? Thanks in advance for any help.

CodePudding user response:

You can use ** to unpack a dictionary into keyword arguments, so the column name is taken from the variable rather than from the literal keyword:

name = 'flag'
out = df.assign(**{name: df[name].mask(df['visit_time'].notnull(), df[name].shift())})
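Wrapped in a function, as the question asks, this could look roughly like the sketch below. The function name copy_previous_flag and the time_col parameter are hypothetical, chosen only for illustration, and the sample frame is a shortened, made-up version of the data in the question.

```python
import pandas as pd


def copy_previous_flag(df, name, time_col="visit_time"):
    """Hypothetical helper: return a copy of df in which column `name`
    takes the previous row's value wherever `time_col` is non-null."""
    has_visit = df[time_col].notnull()
    new_values = df[name].mask(has_visit, df[name].shift())
    # ** unpacks {name: new_values} into keyword arguments, so the
    # column to overwrite is chosen at run time.
    return df.assign(**{name: new_values})


# Shortened, made-up version of the question's data.
df = pd.DataFrame({
    "code": [0, 0, 0, 0],
    "visit_time": pd.to_timedelta(
        [pd.NaT, "1 days 03:00:12", pd.NaT, "0 days 05:00:00"]
    ),
    "flag": [True, False, False, True],
})

out = copy_previous_flag(df, "flag")
print(out)
```

Because assign returns a new DataFrame, df itself is left unchanged and no extra column is created. If modifying a copy explicitly is preferred, out = df.copy() followed by out[name] = ... should give the same result.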