Consider the following Python pandas DataFrame:
code | visit_time | flag | other | counter |
---|---|---|---|---|
0 | NaT | True | X | 3 |
0 | 1 days 03:00:12 | False | Y | 1 |
0 | NaT | False | X | 3 |
0 | 0 days 05:00:00 | True | X | 2 |
1 | NaT | False | Z | 3 |
1 | NaT | True | X | 3 |
1 | 1 days 03:00:12 | False | Y | 1 |
2 | NaT | True | X | 3 |
2 | 5 days 10:01:12 | True | Y | 0 |
To solve the problem, only the columns code, visit_time and flag are needed.
Each row with a non-null visit_time is preceded by a row whose visit_time is NaT. Knowing this, I want to make the following modification to the DataFrame:
- Set the flag of each row with a non-null visit_time to the same value as its previous row.
Code used (from @Cameron Riddell):
out = df.assign(
flag=df['flag'].mask(df['visit_time'].notnull(), df['flag'].shift())
)
print(out)
code visit_time flag other counter
0 0 NaT True X 3
1 0 1 days 03:00:12 True Y 1
2 0 NaT False X 3
3 0 0 days 05:00:00 False X 2
4 1 NaT False Z 3
5 1 NaT True X 3
6 1 1 days 03:00:12 True Y 1
7 2 NaT True X 3
8 2 5 days 10:01:12 True Y 0
The problem is that I want to reuse this code in a function, so the name of the flag column will be stored in a variable, say name. If I use the following code, a new column literally called name is created in the DataFrame instead of flag being updated.
name = 'flag'
out = df.assign(
name=df[name].mask(df['visit_time'].notnull(), df[name].shift())
)
How can I get the same behavior while modifying the column whose name is passed as a parameter? Thanks in advance for any help.
CodePudding user response:
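assign takes each keyword itself as the column name, so name=... creates a column called name. You can use ** to unpack a dictionary into keyword arguments, so that the value stored in the variable becomes the keyword: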
name = 'flag'
out = df.assign(**{name: df[name].mask(df['visit_time'].notnull(), df[name].shift())})
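Since the goal was to reuse this in a function, here is a minimal sketch of how that could look. The function name copy_previous_flag and the time_col parameter are my own choices for illustration, not something from the question, and the small DataFrame only reproduces the first four rows of the example.

```python
import pandas as pd


def copy_previous_flag(df, name, time_col="visit_time"):
    """Return a new DataFrame in which every row with a non-null `time_col`
    takes the value of column `name` from the row just before it.

    `copy_previous_flag` and `time_col` are hypothetical names for this sketch.
    """
    has_time = df[time_col].notnull()
    # ** turns {name: series} into a keyword argument whose key is the
    # value held in `name`, so the existing column is overwritten.
    return df.assign(**{name: df[name].mask(has_time, df[name].shift())})


# First four rows of the example data (only the columns that matter here).
df = pd.DataFrame({
    "code": [0, 0, 0, 0],
    "visit_time": pd.Series([
        pd.NaT,
        pd.Timedelta("1 days 03:00:12"),
        pd.NaT,
        pd.Timedelta("0 days 05:00:00"),
    ]),
    "flag": [True, False, False, True],
})

out = copy_previous_flag(df, "flag")
print(out)
```

Because assign returns a new DataFrame, df itself is left untouched. If you would rather modify it in place, a plain df[name] = ... assignment with the same right-hand side should also work.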