Home > database >  Is there an R function to apply a vector to a column in a df, and create new columns?
Is there an R function to apply a vector to a column in a df, and create new columns?

Time:07-01

I am using tidyverse to do some analysis to a dataset of home prices. I have home prices over time, and I would like to create new columns by applying a vector of down payments to output a new column for each down payment percentage.

set.seed(100)
dates <- seq.Date(from = as.Date('2010-01-01'), to = as.Date('2010-12-01'), by = 'months')
prices <- rnorm(12, mean = 100000, sd = 5000)
 
home_vals <- data.frame(dates, prices)
 
out <- home_vals %>%
  mutate(
    DownPayment_20 = 0.2 * prices
    ,DownPayment_10 = 0.1 * prices
    ,DownPayment_5 = 0.05 * prices
  )

head(out)
       dates    prices DownPayment_20 DownPayment_10 DownPayment_5
1 2010-01-01  97489.04       19497.81       9748.904      4874.452
2 2010-02-01 100657.66       20131.53      10065.766      5032.883
3 2010-03-01  99605.41       19921.08       9960.541      4980.271
4 2010-04-01 104433.92       20886.78      10443.392      5221.696
5 2010-05-01 100584.86       20116.97      10058.486      5029.243
6 2010-06-01 101593.15       20318.63      10159.315      5079.658

I'd like to be able to do this for any number of down payment inputs, by just passing a vector like down_payments <- c(0.2, 0.1, 0.05), but I cannot figure out a straightforward way to do this without manually mutating every column. Additionally, I might want to include other vectors with inputs, and use both the down payment vector and the new vector in some calculation, and have a new column for that amount.

Thanks!

CodePudding user response:

How about this:

library(dplyr)
library(purrr)
library(tidyr)
set.seed(100)
dates <- seq.Date(from = as.Date('2010-01-01'), to = as.Date('2010-12-01'), by = 'months')
prices <- rnorm(12, mean = 100000, sd = 5000)

home_vals <- data.frame(dates, prices)

downs <- c(.2, .1, .05)

map(downs, function(x){
  home_vals %>% 
    mutate(!!sym(paste0("DownPayment_", round(x*100))) := prices*x)
}) %>% 
  do.call(coalesce, .)
#>         dates    prices DownPayment_20 DownPayment_10 DownPayment_5
#> 1  2010-01-01  97489.04       19497.81       9748.904      4874.452
#> 2  2010-02-01 100657.66       20131.53      10065.766      5032.883
#> 3  2010-03-01  99605.41       19921.08       9960.541      4980.271
#> 4  2010-04-01 104433.92       20886.78      10443.392      5221.696
#> 5  2010-05-01 100584.86       20116.97      10058.486      5029.243
#> 6  2010-06-01 101593.15       20318.63      10159.315      5079.658
#> 7  2010-07-01  97091.05       19418.21       9709.105      4854.552
#> 8  2010-08-01 103572.66       20714.53      10357.266      5178.633
#> 9  2010-09-01  95873.70       19174.74       9587.370      4793.685
#> 10 2010-10-01  98200.69       19640.14       9820.069      4910.034
#> 11 2010-11-01 100449.43       20089.89      10044.943      5022.472
#> 12 2010-12-01 100481.37       20096.27      10048.137      5024.069

Created on 2022-06-30 by the reprex package (v2.0.1)

CodePudding user response:

With data.table:

library(data.table)

setDT(home_vals)
home_vals[,(paste0("DownPayment_",down_payments*100)):=lapply(down_payments,\(x) x*prices)][]

         dates    prices DownPayment_20 DownPayment_10 DownPayment_5
        <Date>     <num>          <num>          <num>         <num>
 1: 2010-01-01  97489.04       19497.81       9748.904      4874.452
 2: 2010-02-01 100657.66       20131.53      10065.766      5032.883
 3: 2010-03-01  99605.41       19921.08       9960.541      4980.271
 4: 2010-04-01 104433.92       20886.78      10443.392      5221.696
 5: 2010-05-01 100584.86       20116.97      10058.486      5029.243
 6: 2010-06-01 101593.15       20318.63      10159.315      5079.658
 7: 2010-07-01  97091.05       19418.21       9709.105      4854.552
 8: 2010-08-01 103572.66       20714.53      10357.266      5178.633
 9: 2010-09-01  95873.70       19174.74       9587.370      4793.685
10: 2010-10-01  98200.69       19640.14       9820.069      4910.034
11: 2010-11-01 100449.43       20089.89      10044.943      5022.472
12: 2010-12-01 100481.37       20096.27      10048.137      5024.069

CodePudding user response:

in Base R use the outer function or simply %o%

cbind(home_vals, home_vals$prices %o% setNames(downs, paste0('DownPayment_', downs)))

        dates    prices DownPayment_0.2 DownPayment_0.1 DownPayment_0.05
1  2010-01-01  97489.04        19497.81        9748.904         4874.452
2  2010-02-01 100657.66        20131.53       10065.766         5032.883
3  2010-03-01  99605.41        19921.08        9960.541         4980.271
4  2010-04-01 104433.92        20886.78       10443.392         5221.696
5  2010-05-01 100584.86        20116.97       10058.486         5029.243
6  2010-06-01 101593.15        20318.63       10159.315         5079.658
7  2010-07-01  97091.05        19418.21        9709.105         4854.552
8  2010-08-01 103572.66        20714.53       10357.266         5178.633
9  2010-09-01  95873.70        19174.74        9587.370         4793.685
10 2010-10-01  98200.69        19640.14        9820.069         4910.034
11 2010-11-01 100449.43        20089.89       10044.943         5022.472
12 2010-12-01 100481.37        20096.27       10048.137         5024.069
  • Related