I have the following dataframe:
x income
1 46200
2 89345
3 189982
4 255465
5 301189
6 18100
7 55234
8 900672
9 108221
10 22201
I am attempting to apply a different function to each value depending on the range it lies within (e.g. 0-20k, 20-50k, 50-90k, 90-140k, & >140k). This function will look something like:
Average tax rate = ((Tax from prior marginal bracket ((income - current bracket lower bound)*marginal rate))/income
I have attempted using a for loop with if & else if conditions however I'm having little success in applying this to my df, I'm not sure what my alternatives are in performing the required manipulation. I'm also unsure what R function would be best for such manipulation. Any help is greatly appreciated as R is new to me.
edit
Tax schedule & prior taxable amounts
Reproducible example:
Income = 60,000
Avgtax = ((6000 ((60000 - 50000)*0.35))/60000
Avgtax = 15.83%
CodePudding user response:
Here's the data in a more friendly reproducible format with the specific example given added in as the last row
dd <- read.table(text="
x income
1 46200
2 89345
3 189982
4 255465
5 301189
6 18100
7 55234
8 900672
9 108221
10 22201
11 60000", header=T)
First, move your tax data out of an image and into a table
taxinfo <- read.table(text="
min max rate prior
0 20000 0 0
20000 50000 .2 0
50000 90000 .35 6000
90000 140000 .4 20000
140000 Inf .5 40000", header=TRUE)
Now we can more easily extract relevant info from this table using the row index for each bracket. We can easily find the bracket for each individual using cut()
dd$bracket <- cut(dd$income, breaks=c(-Inf, taxinfo$max),labels = FALSE)
We use the max values from the taxinfo
table to create the breaks. We assign this value to a new column in the table. Now we know the index for each bracket the person is in. We can use that info to index into the taxinfo
table to do the calculation
dd$avgtax <- with(taxinfo,
(prior[dd$bracket] (dd$income-min[dd$bracket])*rate[dd$bracket]) / dd$income
)
The values of prior
, min
and rate
come from taxinfo. We use the bracket value to index into the columns for the values for each group. We assign this to a new column of the data as well. This is the output
x income bracket avgtax
1 1 46200 2 0.11341991
2 2 89345 3 0.22128547
3 3 189982 5 0.34209030
4 4 255465 5 0.38256708
5 5 301189 5 0.40039477
6 6 18100 1 0.00000000
7 7 55234 3 0.14179491
8 8 900672 5 0.46669154
9 9 108221 4 0.25215439
10 10 22201 2 0.01982794
11 11 60000 3 0.15833333
CodePudding user response:
This is relatively straightforward with the dplyr
package. First, here's a bit of your sample data that we can work with as an example:
df <- data.frame(
x = c(1, 2, 3, 4, 5, 6),
income = c(46200, 89345, 189982, 255465, 301189, 60000)
)
df
Output:
x income
1 1 46200
2 2 89345
3 3 189982
4 4 255465
5 5 301189
6 6 60000
Now, we can derive the other values we need and perform the final calculation:
library(dplyr)
df %>%
mutate(
marginal_rate = case_when(
income < 20000 ~ 0,
income >= 20000 & income < 50000 ~ .2,
income >= 50000 & income < 90000 ~ .35,
income >= 90000 & income < 140000 ~ .4,
income >= 140000 ~ .5
),
prior_taxable_amount = case_when(
income < 50000 ~ 0,
income >= 50000 & income < 90000 ~ 6000,
income >= 90000 & income < 140000 ~ 20000,
income >= 140000 ~ 40000
),
current_bracket_lower_bound = case_when(
income < 20000 ~ 0,
income >= 20000 & income < 50000 ~ 20000,
income >= 50000 & income < 90000 ~ 50000,
income >= 90000 & income < 140000 ~ 90000,
income >= 140000 ~ 140000
),
avgtax =
(prior_taxable_amount
((income - current_bracket_lower_bound) *
marginal_rate)) / income
)
Output:
x income marginal_rate prior_taxable_amount current_bracket_lower_bound avgtax
1 1 46200 0.20 0 20000 0.1134199
2 2 89345 0.35 6000 50000 0.2212855
3 3 189982 0.50 40000 140000 0.3420903
4 4 255465 0.50 40000 140000 0.3825671
5 5 301189 0.50 40000 140000 0.4003948
6 6 60000 0.35 6000 50000 0.1583333