Iterating a function with multiple arguments over different variables with R

Time:09-12

I have a dataset with 50 observations of 10 variables, and I would like to apply the following function to every permutation of two variables.

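new_fun <- function(data, x, y) {
  # x and y are column names; drop = FALSE keeps one-column data frames
  vx <- data[ , x, drop = FALSE]
  vy <- data[ , y, drop = FALSE]
  value <- (vx - vy) / (vx + vy)
  # name the result column after the pair, e.g. "var1/var2"
  colnames(value) <- paste(x, "/", y, sep = "")
  return(value)
}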

Here is part of the dataset:

var1    var2    var3    var4    var5    var6    var7    var8    var9    var10
1268    1522    1268    1842    4728    5611    5544    2374    1535    5773
1286    1534    1259    1829    4834    5802    5776    2383    1538    5928
1296    1534    1266    1853    4905    5805    5916    2418    1545    5949
1296    1488    1239    1791    4963    5985    5880    2359    1524    6142
1273    1503    1228    1787    4694    5608    5608    2268    1476    5725
1290    1522    1271    1811    4799    5728    5752    2402    1555    5832
1265    1510    1247    1786    4981    6072    6172    2409    1526    6258
1289    1527    1246    1841    4876    5827    5808    2361    1522    6009
1322    1590    1351    1917    4532    5271    5264    2412    1589    5418
1334    1589    1445    1899    3680    4638    4820    2321    1638    4974
1347    1532    1370    1865    3618    4702    4852    2275    1619    4994

The idea is to get a new dataset with 50 observations and 90 columns (n = 10, r = 2, no repeats).

     var1/var2   var1/var3   var1/var4  ...
1       .            .           .      ...
2       .            .           .      ...
3       .            .           .      ...
.       .            .           .      ...
.       .            .           .      ...
.       .            .           .      ...

I have tried apply functions and loops with no success so far. Any help is greatly appreciated!

CodePudding user response:

You can do this with purrr, which is part of the tidyverse:

library(tidyverse)

# the data you provided
varst <- as.data.frame(read_csv("var1,var2,var3,var4,var5,var6,var7,var8,var9,var10
1268,1522,1268,1842,4728,5611,5544,2374,1535,5773
1286,1534,1259,1829,4834,5802,5776,2383,1538,5928
1296,1534,1266,1853,4905,5805,5916,2418,1545,5949
1296,1488,1239,1791,4963,5985,5880,2359,1524,6142
1273,1503,1228,1787,4694,5608,5608,2268,1476,5725
1290,1522,1271,1811,4799,5728,5752,2402,1555,5832
1265,1510,1247,1786,4981,6072,6172,2409,1526,6258
1289,1527,1246,1841,4876,5827,5808,2361,1522,6009
1322,1590,1351,1917,4532,5271,5264,2412,1589,5418
1334,1589,1445,1899,3680,4638,4820,2321,1638,4974
1347,1532,1370,1865,3618,4702,4852,2275,1619,4994"))

map_dfc(names(varst), # cycle through each column
        function(x) {
          # pair x with every other column
          map(setdiff(names(varst), x),
              function(y) { # your function as above
                v_x <- varst[x]
                v_y <- varst[y]
                ret <- (v_x - v_y) / (v_x + v_y)
                names(ret) <- paste0(x, "/", y)
                ret # return the calculated values
              })
        })
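
If you would rather not depend on the tidyverse, the same idea can be sketched in base R. This is only a minimal sketch, not the method used above: it assumes every column is numeric, and for brevity it uses just the first three rows of var1 to var3 from the question. The names `pairs`, `cols` and `results` are made up for the illustration.

```r
# Stand-in data: first three rows of var1..var3 from the question
varst <- data.frame(
  var1 = c(1268, 1286, 1296),
  var2 = c(1522, 1534, 1534),
  var3 = c(1268, 1259, 1266)
)

# All ordered pairs (x, y) of column names, then drop the x == y pairs
pairs <- expand.grid(y = names(varst), x = names(varst),
                     stringsAsFactors = FALSE)
pairs <- pairs[pairs$x != pairs$y, ]

# One numeric vector per pair: (x - y) / (x + y)
cols <- Map(function(x, y) {
  (varst[[x]] - varst[[y]]) / (varst[[x]] + varst[[y]])
}, pairs$x, pairs$y)
names(cols) <- paste0(pairs$x, "/", pairs$y)

# check.names = FALSE keeps the "/" in the column names
results <- do.call(data.frame, c(cols, check.names = FALSE))
```

With the three stand-in columns this gives 3 * 2 = 6 result columns. Run against the full data frame with 10 columns, the same code should give the 10 * 9 = 90 columns asked for.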

