I have a DataFrame with two columns, say df:
year cases
1.1 12
1.2 14
1.4 19
1.6 23
1.6 14
2.1 26
2.5 27
2.7 35
3.1 21
3.3 24
3.8 28
and a list of false cases, say f
f = [3,4,8]
I want to write code so that, for each one-year interval, the corresponding 'false cases' value is subtracted from the number of cases.
So for example, whilst 1 < year < 2, I want: cases - 3
Then when 2 < year < 3, I want: cases - 4
and when 3 < year < 4, I want: cases - 8
and so on
so that a new column, say actual cases, is:
year actual cases
1.1 9 (12-3)
1.2 11 (14-3)
1.4 16 (19-3)
1.6 20 (23-3)
1.6 11 (14-3)
2.1 22 (26-4)
2.5 23 (27-4)
2.7 31 (35-4)
3.1 13 (21-8)
3.3 16 (24-8)
3.8 20 (28-8)
I tried something along the lines of
for i in range(0, df["year"]):
    if int(df["year"][i]) > int(df["year"][i + 1]):
        df["cases"][i] - f[i]
But this is clearly wrong and I am not sure what to do.
CodePudding user response:
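You can do something like this, which floors each year and maps it to the matching entry of f:
df['actual cases'] = df['cases'] - (df['year']//1).astype(int).map(dict(enumerate(f, 1)))
or
df['actual cases'] = df['cases'] - pd.Series(f).reindex((df['year']//1 - 1).astype(int)).to_numpy()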
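For reference, here is a minimal self-contained sketch of the mapping approach, using the data from the question. The intermediate names `year_bucket` and `false_cases` are just illustrative, and it assumes every year falls in a bucket covered by f (otherwise `map` returns NaN for that row).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "year": [1.1, 1.2, 1.4, 1.6, 1.6, 2.1, 2.5, 2.7, 3.1, 3.3, 3.8],
    "cases": [12, 14, 19, 23, 14, 26, 27, 35, 21, 24, 28],
})
f = [3, 4, 8]

# Whole-year bucket for each row: 1.1 -> 1, 2.5 -> 2, 3.8 -> 3
year_bucket = np.floor(df["year"]).astype(int)

# Bucket 1 maps to f[0], bucket 2 to f[1], and so on
false_cases = year_bucket.map(dict(enumerate(f, start=1)))

# New column with the per-year correction applied
df["actual cases"] = df["cases"] - false_cases
print(df)
```

Note that a year of exactly 2.0 lands in bucket 2 here, which the question's strict inequalities leave unspecified.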
CodePudding user response:
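Something like this should work, called once per year with the matching entry of f:
def my_fun(df, year, factor):
    # Rows whose year falls in the given whole-year bucket
    mask = df['year'].astype(int) == year
    # Use .loc rather than chained indexing so the assignment hits df itself
    df.loc[mask, 'cases'] = df.loc[mask, 'cases'] - factor
    return df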
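A function like this still has to be called once per year. A rough sketch of that loop, assuming the years start at 1 and line up with the positions in f; the function name `subtract_for_year` is a hypothetical stand-in, and note that this version overwrites the cases column in place rather than creating a new one:

```python
import pandas as pd

def subtract_for_year(df, year, factor):
    # Rows whose year truncates to the given whole year
    mask = df["year"].astype(int) == year
    # .loc with a boolean mask modifies df directly
    df.loc[mask, "cases"] -= factor
    return df

# Sample data copied from the question
df = pd.DataFrame({
    "year": [1.1, 1.2, 1.4, 1.6, 1.6, 2.1, 2.5, 2.7, 3.1, 3.3, 3.8],
    "cases": [12, 14, 19, 23, 14, 26, 27, 35, 21, 24, 28],
})
f = [3, 4, 8]

# Year 1 uses f[0], year 2 uses f[1], year 3 uses f[2]
for year, factor in enumerate(f, start=1):
    df = subtract_for_year(df, year, factor)

print(df)
```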
CodePudding user response:
I would do it like this:
f = [3, 4, 8]
for i, row in df.iterrows():
    if 1 <= row["year"] < 2:
        df.at[i, "case"] = row["case"] - f[0]
    elif 2 <= row["year"] < 3:
        df.at[i, "case"] = row["case"] - f[1]
    else:
        df.at[i, "case"] = row["case"] - f[2]
The original dataframe:
   year  case
0   1.0     8
1   1.1     5
2   1.2    17
3   1.3     1
4   1.4    12
The result:
   year  case
0   1.0     5
1   1.1     2
2   1.2    14
3   1.3    -2
4   1.4     9
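If the row-by-row loop becomes slow on a larger frame, the same if/elif/else branches can probably be expressed with `np.select`. This is only a sketch of that alternative on the small example frame above, not a drop-in replacement; the names `conditions` and `offset` are illustrative.

```python
import numpy as np
import pandas as pd

# Example frame from this answer
df = pd.DataFrame({
    "year": [1.0, 1.1, 1.2, 1.3, 1.4],
    "case": [8, 5, 17, 1, 12],
})
f = [3, 4, 8]

# One boolean condition per branch of the if/elif chain
conditions = [
    (df["year"] >= 1) & (df["year"] < 2),
    (df["year"] >= 2) & (df["year"] < 3),
]

# Pick f[0] or f[1] where a condition holds; f[2] plays the role of the else branch
offset = np.select(conditions, [f[0], f[1]], default=f[2])

df["case"] = df["case"] - offset
print(df)
```

As in the loop, every row that matches neither range falls through to f[2], including years below 1.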