I have a dataframe in which, for rows where the date is on or after a specific date, I wish to convert all 0's to blanks in specific columns, per area category.
Data
date       start      end        area  stat1  stat2  stat3  final
10/1/2021  11/1/2021  12/1/2021  NY    5      0      0      11/1/2021
11/1/2021  12/1/2021  1/1/2022   NY    19     0      0      11/1/2021
12/1/2021  1/1/2022   2/1/2022   NY    10     0      0      11/1/2021
1/1/2022   2/1/2022   3/1/2022   NY    1      0      0      11/1/2021
10/1/2021  11/1/2021  12/1/2021  CA    1      0      0      11/1/2021
11/1/2021  12/1/2021  1/1/2022   CA    3      0      0      11/1/2021
12/1/2021  1/1/2022   2/1/2022   CA    3      0      0      11/1/2021
1/1/2022   2/1/2022   3/1/2022   CA    2      0      0      11/1/2021
Desired
date       start      end        area  stat1  stat2  stat3  final
10/1/2021  11/1/2021  12/1/2021  NY    5      0      0      11/1/2021
11/1/2021  12/1/2021  1/1/2022   NY    19                   11/1/2021
12/1/2021  1/1/2022   2/1/2022   NY    10                   11/1/2021
1/1/2022   2/1/2022   3/1/2022   NY    1                    11/1/2021
10/1/2021  11/1/2021  12/1/2021  CA    1      0      0      11/1/2021
11/1/2021  12/1/2021  1/1/2022   CA    3                    11/1/2021
12/1/2021  1/1/2022   2/1/2022   CA    3                    11/1/2021
1/1/2022   2/1/2022   3/1/2022   CA    2                    11/1/2021
Logic
Above, we want to convert all zeros to blanks, only in columns [stat2] and [stat3], for rows where the date in the [date] column is '11/01/2021' or later.
Doing
I think I need to group by, create a subset, and then perform the conversion:
df1 = df.groupby(['date', 'area'], as_index=False)
df1[df1.eq(0)] = np.nan
Any suggestion is appreciated.
CodePudding user response:
Apply the condition with np.where inside a list comprehension, blanking a cell only when the date qualifies and the value is 0:
df.update(pd.DataFrame(np.transpose([np.where(pd.to_datetime(df['date'], dayfirst=False).ge('2021-11-01') & df[x].eq(0), '', df[x].astype(str)) for x in df.filter(regex='stat2|stat3').columns]), columns=['stat2', 'stat3']))
Or:
df[['stat2', 'stat3']] = pd.DataFrame([np.where(pd.to_datetime(df['date'], dayfirst=False).ge('2021-11-01') & df[x].eq(0), '', df[x].astype(str)) for x in df.filter(regex='stat2|stat3').columns]).T
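Or, if it is easier, go the conventional way:
# True for rows dated on or after 2021-11-01 (dates are month/day/year)
s = pd.to_datetime(df['date'], dayfirst=False).ge('2021-11-01')
# Blank a cell only when the row qualifies and the value is 0; keep other values as strings
df = df.assign(
    stat2=np.where(s & df['stat2'].eq(0), '', df['stat2'].astype(str)),
    stat3=np.where(s & df['stat3'].eq(0), '', df['stat3'].astype(str)))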
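For comparison, here is a minimal sketch of the same idea using Series.mask instead of np.where. It assumes the dates are month/day/year strings as in the question, and it rebuilds only the relevant columns of the sample data. Because the condition is evaluated row by row, no groupby on area appears to be needed. The names cols and on_or_after are illustrative only.

```python
import pandas as pd

# Sample data from the question, reduced to the columns that matter here.
df = pd.DataFrame({
    "date": ["10/1/2021", "11/1/2021", "12/1/2021", "1/1/2022"] * 2,
    "area": ["NY"] * 4 + ["CA"] * 4,
    "stat1": [5, 19, 10, 1, 1, 3, 3, 2],
    "stat2": [0] * 8,
    "stat3": [0] * 8,
})

cols = ["stat2", "stat3"]  # columns whose zeros should be blanked

# Boolean Series: True for rows dated on or after 2021-11-01.
on_or_after = (
    pd.to_datetime(df["date"], format="%m/%d/%Y") >= pd.Timestamp("2021-11-01")
)

for col in cols:
    # Cast to object so integers and blank strings can share one column,
    # then blank only cells that are zero AND sit in a qualifying row.
    df[col] = df[col].astype(object).mask(on_or_after & df[col].eq(0), "")

print(df)
```

Non-zero values and rows before the cutoff are left as they were. If a true missing value is preferred over an empty string, passing no replacement to mask yields NaN instead.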