I have a dataset with 50 observations of 10 variables, and I would like to apply the following function to every permutation of two variables.
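new_fun <- function(data, x, y) {
  # drop = FALSE keeps one-column data frames, so the column names survive
  x <- data[, x, drop = FALSE]
  y <- data[, y, drop = FALSE]
  value <- (x - y) / (x + y)
  colnames(value) <- paste(names(x), "/", names(y), sep = "")
  return(value)
}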
Here is part of the dataset:
var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
1268 1522 1268 1842 4728 5611 5544 2374 1535 5773
1286 1534 1259 1829 4834 5802 5776 2383 1538 5928
1296 1534 1266 1853 4905 5805 5916 2418 1545 5949
1296 1488 1239 1791 4963 5985 5880 2359 1524 6142
1273 1503 1228 1787 4694 5608 5608 2268 1476 5725
1290 1522 1271 1811 4799 5728 5752 2402 1555 5832
1265 1510 1247 1786 4981 6072 6172 2409 1526 6258
1289 1527 1246 1841 4876 5827 5808 2361 1522 6009
1322 1590 1351 1917 4532 5271 5264 2412 1589 5418
1334 1589 1445 1899 3680 4638 4820 2321 1638 4974
1347 1532 1370 1865 3618 4702 4852 2275 1619 4994
The idea is to end up with a new dataset of 50 observations and 90 columns, one per ordered pair of distinct variables (n = 10, r = 2, no repeats, so 10 × 9 = 90).
var1/var2 var1/var3 var1/var4 ...
1 . . . ...
2 . . . ...
3 . . . ...
. . . . ...
. . . . ...
. . . . ...
I have tried apply functions and loops with no success so far. Any help is greatly appreciated!
CodePudding user response:
You can do this using the tidyverse and the purrr package:
library(tidyverse)
# the data you provided
varst <- as.data.frame(read_csv("var1,var2,var3,var4,var5,var6,var7,var8,var9,var10
1268,1522,1268,1842,4728,5611,5544,2374,1535,5773
1286,1534,1259,1829,4834,5802,5776,2383,1538,5928
1296,1534,1266,1853,4905,5805,5916,2418,1545,5949
1296,1488,1239,1791,4963,5985,5880,2359,1524,6142
1273,1503,1228,1787,4694,5608,5608,2268,1476,5725
1290,1522,1271,1811,4799,5728,5752,2402,1555,5832
1265,1510,1247,1786,4981,6072,6172,2409,1526,6258
1289,1527,1246,1841,4876,5827,5808,2361,1522,6009
1322,1590,1351,1917,4532,5271,5264,2412,1589,5418
1334,1589,1445,1899,3680,4638,4820,2321,1638,4974
1347,1532,1370,1865,3618,4702,4852,2275,1619,4994"))
map_dfc(names(varst), # cycle through each column
        function(x) {
          # fetch all columns besides x to pair with it
          map(setdiff(names(varst), x),
              function(y) { # your function as above
                v_x <- varst[x]
                v_y <- varst[y]
                ret <- (v_x - v_y) / (v_x + v_y)
                names(ret) <- paste0(x, "/", y)
                ret # return the calculated values
              })
        })
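If you would rather avoid the extra dependency, the same idea can be sketched in base R. This is only a rough sketch under a few assumptions: the intended formula is (x - y) / (x + y), and `toy`, `pairs`, `cols` and `result` are made-up names for illustration, with `toy` standing in for your real data frame.

```r
# Toy data standing in for the real 50 x 10 data frame (values are made up)
toy <- data.frame(var1 = c(1, 2, 3), var2 = c(4, 5, 6), var3 = c(7, 8, 10))

# All ordered pairs of distinct column names: n * (n - 1) rows.
# y is listed first so that it varies fastest (var1/var2, var1/var3, ...).
pairs <- expand.grid(y = names(toy), x = names(toy), stringsAsFactors = FALSE)
pairs <- pairs[pairs$x != pairs$y, ]

# Assumed formula (x - y) / (x + y), computed on plain numeric vectors
cols <- Map(function(x, y) (toy[[x]] - toy[[y]]) / (toy[[x]] + toy[[y]]),
            pairs$x, pairs$y)
names(cols) <- paste0(pairs$x, "/", pairs$y)

# check.names = FALSE keeps the "/" in the column names
result <- as.data.frame(cols, check.names = FALSE)
```

With the real data you would replace `toy` with your own data frame; for 10 variables, `ncol(result)` should then come out as 90.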