Home > Enterprise >  Modify values based on condition in R
Modify values based on condition in R

Time:03-11

I have the following dataframe:

x   income

1   46200           
2   89345           
3   189982          
4   255465          
5   301189          
6   18100           
7   55234           
8   900672          
9   108221          
10  22201

I am attempting to apply a different function to each value depending on the range it lies within (e.g. 0-20k, 20-50k, 50-90k, 90-140k, & >140k). This function will look something like:

Average tax rate = ((Tax from prior marginal bracket ((income - current bracket lower bound)*marginal rate))/income

I have attempted using a for loop with if & else if conditions however I'm having little success in applying this to my df, I'm not sure what my alternatives are in performing the required manipulation. I'm also unsure what R function would be best for such manipulation. Any help is greatly appreciated as R is new to me.

edit

Tax schedule & prior taxable amounts

Reproducible example:

Income = 60,000

Avgtax = ((6000 ((60000 - 50000)*0.35))/60000

Avgtax = 15.83%

CodePudding user response:

Here's the data in a more friendly reproducible format with the specific example given added in as the last row

dd <- read.table(text="
x   income
1   46200           
2   89345           
3   189982          
4   255465          
5   301189          
6   18100           
7   55234           
8   900672          
9   108221          
10  22201
11  60000", header=T)

First, move your tax data out of an image and into a table

taxinfo <- read.table(text="
min max rate prior
0 20000 0 0
20000 50000 .2 0
50000 90000 .35 6000
90000 140000 .4 20000
140000 Inf .5 40000", header=TRUE)

Now we can more easily extract relevant info from this table using the row index for each bracket. We can easily find the bracket for each individual using cut()

dd$bracket <- cut(dd$income, breaks=c(-Inf, taxinfo$max),labels = FALSE)

We use the max values from the taxinfo table to create the breaks. We assign this value to a new column in the table. Now we know the index for each bracket the person is in. We can use that info to index into the taxinfo table to do the calculation

dd$avgtax <- with(taxinfo, 
  (prior[dd$bracket]   (dd$income-min[dd$bracket])*rate[dd$bracket]) / dd$income
)

The values of prior, min and rate come from taxinfo. We use the bracket value to index into the columns for the values for each group. We assign this to a new column of the data as well. This is the output

    x income bracket     avgtax
1   1  46200       2 0.11341991
2   2  89345       3 0.22128547
3   3 189982       5 0.34209030
4   4 255465       5 0.38256708
5   5 301189       5 0.40039477
6   6  18100       1 0.00000000
7   7  55234       3 0.14179491
8   8 900672       5 0.46669154
9   9 108221       4 0.25215439
10 10  22201       2 0.01982794
11 11  60000       3 0.15833333

CodePudding user response:

This is relatively straightforward with the dplyr package. First, here's a bit of your sample data that we can work with as an example:

df <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  income = c(46200, 89345, 189982, 255465, 301189, 60000)
)
df

Output:

  x income
1 1  46200
2 2  89345
3 3 189982
4 4 255465
5 5 301189
6 6  60000

Now, we can derive the other values we need and perform the final calculation:

library(dplyr)

df %>% 
  mutate(
    marginal_rate = case_when(
      income < 20000 ~ 0, 
      income >= 20000 & income < 50000 ~ .2,
      income >= 50000 & income < 90000 ~ .35,
      income >= 90000 & income < 140000 ~ .4,
      income >= 140000 ~ .5
    ), 
    prior_taxable_amount = case_when(
      income < 50000 ~ 0, 
      income >= 50000 & income < 90000 ~ 6000,
      income >= 90000 & income < 140000 ~ 20000,
      income >= 140000 ~ 40000
    ), 
    current_bracket_lower_bound = case_when(
      income < 20000 ~ 0, 
      income >= 20000 & income < 50000 ~ 20000,
      income >= 50000 & income < 90000 ~ 50000,
      income >= 90000 & income < 140000 ~ 90000,
      income >= 140000 ~ 140000
    ), 
    avgtax = 
      (prior_taxable_amount   
      ((income - current_bracket_lower_bound) * 
          marginal_rate)) / income
  ) 

Output:

  x income marginal_rate prior_taxable_amount current_bracket_lower_bound    avgtax
1 1  46200          0.20                    0                       20000 0.1134199
2 2  89345          0.35                 6000                       50000 0.2212855
3 3 189982          0.50                40000                      140000 0.3420903
4 4 255465          0.50                40000                      140000 0.3825671
5 5 301189          0.50                40000                      140000 0.4003948
6 6  60000          0.35                 6000                       50000 0.1583333
  • Related