Subtract number from dataframe depending on a corresponding number in the dataframe

Time:04-27

I have a dataframe of 2 columns, say df:

 year       cases
 1.1         12
 1.2         14
 1.4         19
 1.6         23
 1.6         14
 2.1         26
 2.5         27
 2.7         35
 3.1         21
 3.3         24
 3.8         28

and a list of false cases, say f

 f = [3,4,8]

I want to write code so that, for each one-year interval, the corresponding 'false cases' value is subtracted from the number of cases.

So for example, whilst 1 < year < 2, I want: cases - 3

Then when 2 < year < 3, I want: cases - 4

and when 3 < year < 4, I want: cases - 8

and so on

so that a new column, say actual cases is:

 year     actual cases
 1.1         9            (12-3)
 1.2         11           (14-3)
 1.4         16           (19-3)
 1.6         20           (23-3)
 1.6         11           (14-3)
 2.1         22           (26-4)
 2.5         23           (27-4)
 2.7         31           (35-4)
 3.1         13           (21-8)
 3.3         16           (24-8)
 3.8         20           (28-8)

I tried something along the lines of

 for i in range(0,df[["year"]:
     if int(df[["year"][i]) > int(df[["year"][i + 1]):
         df[["cases"][i] - f[i]

But this is clearly wrong and I am not sure what to do.

CodePudding user response:

You can do something like this:

df['cases'] - (df['year']//1).astype(int).map({e:i for e, i in enumerate(f, 1)})

or

df['cases'] - pd.Series(f).reindex(df['year']//1-1).to_numpy()
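
For reference, here is a minimal end-to-end sketch of the first expression, run on the data from the question. The column name "actual cases" comes from the question; `false_by_year` is just an illustrative name, and `dict(enumerate(f, 1))` is a shorthand for the dict comprehension above. It assumes the years start at 1 and that every integer year has an entry in f.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [1.1, 1.2, 1.4, 1.6, 1.6, 2.1, 2.5, 2.7, 3.1, 3.3, 3.8],
    "cases": [12, 14, 19, 23, 14, 26, 27, 35, 21, 24, 28],
})
f = [3, 4, 8]

# Map each integer year (1, 2, 3, ...) to its false-case count.
false_by_year = dict(enumerate(f, 1))

# Truncate each year to an integer, look up its count, and subtract it.
df["actual cases"] = df["cases"] - df["year"].astype(int).map(false_by_year)

print(df)
```

A year with no matching entry in f would come out of `map` as NaN, so the result for that row would be NaN as well.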

CodePudding user response:

Something like this should work:

def my_fun(df, year, factor):
    # Rows whose year falls in [year, year + 1)
    mask = df['year'].astype(int) == year
    # .loc avoids chained assignment, which may not modify df
    df.loc[mask, 'cases'] = df.loc[mask, 'cases'] - factor
    return df
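
A function like this handles one year at a time, so it presumably has to be called once per entry in f. A rough sketch of that loop, assuming the first entry of f belongs to year 1 (`subtract_for_year` is a hypothetical stand-in for the function above, and the dataframe is a shortened version of the one in the question):

```python
import pandas as pd

def subtract_for_year(df, year, factor):
    # Hypothetical helper: subtract `factor` from rows whose year is in [year, year + 1).
    mask = df["year"].astype(int) == year
    df.loc[mask, "cases"] = df.loc[mask, "cases"] - factor
    return df

df = pd.DataFrame({
    "year": [1.1, 1.6, 2.1, 2.7, 3.3],
    "cases": [12, 23, 26, 35, 24],
})
f = [3, 4, 8]

# enumerate(f, 1) pairs year 1 with f[0], year 2 with f[1], and so on.
for year, factor in enumerate(f, 1):
    df = subtract_for_year(df, year, factor)

print(df)
```

Note that this modifies "cases" in place rather than creating a new column.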

CodePudding user response:

I would do it like this:

f = [3, 4, 8]

for i, row in df.iterrows():
    if 1<=row["year"]<2:
        df.at[i, "case"] = row["case"] - f[0]
    elif 2<=row["year"]<3:
        df.at[i, "case"] = row["case"] - f[1]
    else:
        df.at[i, "case"] = row["case"] - f[2]

The original dataframe:

   year  case
0   1.0     8
1   1.1     5
2   1.2    17
3   1.3     1
4   1.4    12

The result:

   year  case
0   1.0     5
1   1.1     2
2   1.2    14
3   1.3    -2
4   1.4     9
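
If the dataframe is large, the same result can probably be had without looping over rows. A sketch of one way to do it, using the integer part of the year as a position in f; it assumes every year lies in the range 1 <= year < len(f) + 1, and `idx` is just an illustrative name:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "year": [1.0, 1.1, 1.2, 1.3, 1.4],
    "case": [8, 5, 17, 1, 12],
})
f = [3, 4, 8]

# Position in f for each row: year 1.x -> 0, year 2.x -> 1, year 3.x -> 2.
idx = df["year"].astype(int).to_numpy() - 1

# Pick the matching false-case count for every row and subtract it in one step.
df["case"] = df["case"] - np.asarray(f)[idx]

print(df)
```

Unlike the else branch in the loop, which applies f[2] to every year outside 1 <= year < 3, this raises an IndexError for years of len(f) + 1 or more, and years below 1 would index from the end of f, so the range assumption matters.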