For my data below, I was wondering how to make a named list of pairwise locations (i.e., row numbers) of a data frame's rows in R?
Specifically, I want to do pairwise comparisons within every block of 3 rows. For example, in my Desired_output, you'll see that I have done so manually for the first 6 rows.
For the first 3 rows in my Desired_output, I compare third row - first row, whose row numbers are c(3, -1). Then, I compare third row - second row, whose row numbers are c(3, -2). And then, I compare second row - first row, whose row numbers are c(2, -1).
Can we possibly automate this to obtain my desired output below?
Desired_output =
list("4.100 - 1.700 (c2s,complex,t1)"=c(3,-1),
"4.100 - 2.900 (c2s,complex,t1)"=c(3,-2),
"2.900 - 1.700 (c2s,complex,t1)"=c(2,-1),
"4.100 - 1.700 (s2c,complex,t2)"=c(6,-4),
"4.100 - 2.900 (s2c,complex,t2)"=c(6,-5),
"2.900 - 1.700 (s2c,complex,t2)"=c(5,-4),
.
.
.
)
data = read.table(h=TRUE, text = "
Measure motiv t ord task_df
1 ac_EFC/C 1.700 1 c2s complex
2 ac_EFC/C 2.900 1 c2s complex
3 ac_EFC/C 4.100 1 c2s complex
4 ac_EFC/C 1.700 2 s2c complex
5 ac_EFC/C 2.900 2 s2c complex
6 ac_EFC/C 4.100 2 s2c complex
7 ac_EFC/C 1.700 3 c2s complex
8 ac_EFC/C 2.900 3 c2s complex
9 ac_EFC/C 4.100 3 c2s complex
10 ac_EFC/C 1.700 4 s2c complex
11 ac_EFC/C 2.900 4 s2c complex
12 ac_EFC/C 4.100 4 s2c complex
13 ac_EFC/C 1.700 1 s2c simple
14 ac_EFC/C 2.900 1 s2c simple
15 ac_EFC/C 4.100 1 s2c simple
16 ac_EFC/C 1.700 2 c2s simple
17 ac_EFC/C 2.900 2 c2s simple
18 ac_EFC/C 4.100 2 c2s simple
19 ac_EFC/C 1.700 3 s2c simple
20 ac_EFC/C 2.900 3 s2c simple
21 ac_EFC/C 4.100 3 s2c simple
22 ac_EFC/C 1.700 4 c2s simple
23 ac_EFC/C 2.900 4 c2s simple
24 ac_EFC/C 4.100 4 c2s simple")
CodePudding user response:
We may use combn after grouping. Create a row-number column 'rn', then concatenate the 'ord', 'task_df' and 't' columns into a grouping column 'grp'. Grouped by that column, apply combn on 'motiv' to return the pairwise combinations of its elements and paste (str_c) them, and apply combn on 'rn' to build the matching pairwise index. Finally, unite the 'val' and 'grp' columns and set the names of the list column 'ind' with 'val'.
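As a minimal illustration of what combn contributes within a single group, assume three rows with row numbers 1:3 and the 'motiv' values from the data. The names rn, motiv, ind, val and res exist only for this sketch:

```r
# One group of three rows: their row numbers and 'motiv' values
rn    <- 1:3
motiv <- c(1.7, 2.9, 4.1)

# combn() visits the pairs in the order (1,2), (1,3), (2,3)
# Index pair: later row kept positive, earlier row negated
ind <- combn(rn, 2, FUN = function(x) c(x[2], -x[1]), simplify = FALSE)
# Label: later value minus earlier value
val <- combn(motiv, 2, FUN = function(x) paste(x[2], "-", x[1]))

# Named list of index pairs for this one group
res <- setNames(ind, val)
```

The full pipeline does the same thing once per group and appends the group label to each name.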
library(dplyr)
library(stringr)
library(tidyr)
data %>%
  # row number of every row in the original data
  mutate(rn = row_number()) %>%
  # group label such as '(c2s,complex,t1)'
  mutate(grp = sprintf('(%s,%s,t%s)', ord, task_df, t)) %>%
  # factor levels in order of appearance keep the groups in data order
  group_by(grp = factor(grp, levels = unique(grp))) %>%
  # summarise would also work, but reframe is preferred when each
  # group returns more than one row
  reframe(val = combn(motiv, 2,
                      FUN = \(x) str_c(x[2], ' - ', x[1])),
          ind = combn(rn, 2, \(x) c(x[2], -x[1]), simplify = FALSE)) %>%
  # join the value label and the group label into the final name
  unite(val, val, grp, sep = " ") %>%
  with(., setNames(ind, val))
-output
$`2.9 - 1.7 (c2s,complex,t1)`
[1] 2 -1
$`4.1 - 1.7 (c2s,complex,t1)`
[1] 3 -1
$`4.1 - 2.9 (c2s,complex,t1)`
[1] 3 -2
$`2.9 - 1.7 (s2c,complex,t2)`
[1] 5 -4
$`4.1 - 1.7 (s2c,complex,t2)`
[1] 6 -4
$`4.1 - 2.9 (s2c,complex,t2)`
[1] 6 -5
$`2.9 - 1.7 (c2s,complex,t3)`
[1] 8 -7
$`4.1 - 1.7 (c2s,complex,t3)`
[1] 9 -7
$`4.1 - 2.9 (c2s,complex,t3)`
[1] 9 -8
$`2.9 - 1.7 (s2c,complex,t4)`
[1] 11 -10
$`4.1 - 1.7 (s2c,complex,t4)`
[1] 12 -10
$`4.1 - 2.9 (s2c,complex,t4)`
[1] 12 -11
$`2.9 - 1.7 (s2c,simple,t1)`
[1] 14 -13
$`4.1 - 1.7 (s2c,simple,t1)`
[1] 15 -13
$`4.1 - 2.9 (s2c,simple,t1)`
[1] 15 -14
$`2.9 - 1.7 (c2s,simple,t2)`
[1] 17 -16
$`4.1 - 1.7 (c2s,simple,t2)`
[1] 18 -16
$`4.1 - 2.9 (c2s,simple,t2)`
[1] 18 -17
$`2.9 - 1.7 (s2c,simple,t3)`
[1] 20 -19
$`4.1 - 1.7 (s2c,simple,t3)`
[1] 21 -19
$`4.1 - 2.9 (s2c,simple,t3)`
[1] 21 -20
$`2.9 - 1.7 (c2s,simple,t4)`
[1] 23 -22
$`4.1 - 1.7 (c2s,simple,t4)`
[1] 24 -22
$`4.1 - 2.9 (c2s,simple,t4)`
[1] 24 -23
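The labels above print as 2.9 - 1.7 rather than 4.100 - 1.700, and the pairs come in combn's order rather than the order used in Desired_output. If the exact labels and order matter, one option is a base R sketch like the following. It assumes the rows are already sorted into consecutive blocks of exactly 3 and that 'motiv' is numeric. pairwise_index and demo are hypothetical names introduced here, not part of the answer above:

```r
# Hypothetical helper: named list of index pairs for blocks of 3 rows,
# in the question's order (3rd - 1st, 3rd - 2nd, 2nd - 1st).
pairwise_index <- function(data) {
  stopifnot(nrow(data) %% 3 == 0)
  pairs  <- list(c(3, 1), c(3, 2), c(2, 1))  # positions within a block
  starts <- seq(1, nrow(data), by = 3)       # first row of each block
  n      <- length(starts) * length(pairs)
  out    <- vector("list", n)
  labels <- character(n)
  k <- 0
  for (s in starts) {
    for (p in pairs) {
      k <- k + 1
      i <- s + p[1] - 1                      # absolute row kept positive
      j <- s + p[2] - 1                      # absolute row that is negated
      labels[k] <- sprintf("%.3f - %.3f (%s,%s,t%s)",
                           data$motiv[i], data$motiv[j],
                           as.character(data$ord[i]),
                           as.character(data$task_df[i]),
                           data$t[i])
      out[[k]] <- c(i, -j)
    }
  }
  setNames(out, labels)
}

# Small stand-in for the first 6 rows of the question's data
demo <- data.frame(
  motiv   = rep(c(1.7, 2.9, 4.1), times = 2),
  t       = rep(1:2, each = 3),
  ord     = rep(c("c2s", "s2c"), each = 3),
  task_df = "complex"
)
res <- pairwise_index(demo)
```

The pair order is hard-coded for blocks of 3 because the question only specifies it for that case. For other block sizes the combn approach above generalizes more naturally.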