I have the following data.frame:
> mydf=data.frame(ID=LETTERS, var1=rep(c('a','b'),each=13), var2=c(rep('x',10),rep('y',12),rep('z',4)))
> mydf
ID var1 var2
1 A a x
2 B a x
3 C a x
4 D a x
5 E a x
...
I want to make a list with the unique values (levels) of each variable.
Each element of the list should carry a names attribute: the names should be identical to the original values, and the values themselves should be changed to the variable name followed by the original value.
Let me show you what I mean.
I first turn the data.frame into a list of unique values:
> mylist=lapply(mydf, unique)
> mylist
$ID
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
$var1
[1] "a" "b"
$var2
[1] "x" "y" "z"
Now I want to add a names attribute to each element, so that the names equal the original values and the new values are the variable name plus the original value.
Taking var1 as an example:
> var1_names=mylist$var1
> var1_values=paste0('var1:',mylist$var1)
> mylist$var1=var1_values
> names(mylist$var1)=var1_names
> mylist
...
$var1
a b
"var1:a" "var1:b"
...
See how var1 has changed from:
$var1
[1] "a" "b"
to
$var1
a b
"var1:a" "var1:b"
Note the names attribute and how the values now include the variable name.
Now I would like to do the same thing for every variable in the list.
Is it possible to do this in a simple way with an apply approach, preferably using only base functions? Thanks!
Edit: simplify explanation
CodePudding user response:
You mean this?
library(dplyr)
mydf %>%
mutate(across(starts_with("var"), ~paste0(cur_column(),":", .)))
ID var1 var2
1 A var1:a var2:x
2 B var1:a var2:x
3 C var1:a var2:x
4 D var1:a var2:x
5 E var1:a var2:x
6 F var1:a var2:x
7 G var1:a var2:x
8 H var1:a var2:x
9 I var1:a var2:x
10 J var1:a var2:x
11 K var1:a var2:y
12 L var1:a var2:y
13 M var1:a var2:y
14 N var1:b var2:y
15 O var1:b var2:y
16 P var1:b var2:y
17 Q var1:b var2:y
18 R var1:b var2:y
19 S var1:b var2:y
20 T var1:b var2:y
21 U var1:b var2:y
22 V var1:b var2:y
23 W var1:b var2:z
24 X var1:b var2:z
25 Y var1:b var2:z
26 Z var1:b var2:z
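Since the question prefers base functions, here is a rough base R sketch of the same column-prefixing step, without dplyr. It rebuilds mydf from the question; the names `vars` and `prefixed` are made up for illustration, and `stringsAsFactors = FALSE` is set explicitly on the assumption that older R versions might otherwise create factors.

```r
# Rebuild the example data from the question (character columns, not factors)
mydf <- data.frame(ID = LETTERS,
                   var1 = rep(c("a", "b"), each = 13),
                   var2 = c(rep("x", 10), rep("y", 12), rep("z", 4)),
                   stringsAsFactors = FALSE)

# Pick the columns whose names start with "var" (hypothetical helper name)
vars <- grep("^var", names(mydf), value = TRUE)

# Prefix every value in those columns with its own column name
prefixed <- mydf
prefixed[vars] <- Map(function(col, nm) paste0(nm, ":", col), mydf[vars], vars)

head(prefixed)
```

This only transforms the columns; it does not yet produce the list of unique values with a names attribute that the question asks for.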
CodePudding user response:
Is this what you are after?
lapply(names(mydf), \(x) paste(x, unique(mydf[[x]]), sep = ":"))
[[1]]
[1] "ID:A" "ID:B" "ID:C" "ID:D" "ID:E" "ID:F" "ID:G" "ID:H" "ID:I" "ID:J" "ID:K" "ID:L" "ID:M" "ID:N" "ID:O" "ID:P" "ID:Q" "ID:R" "ID:S" "ID:T" "ID:U" "ID:V" "ID:W" "ID:X" "ID:Y" "ID:Z"
[[2]]
[1] "var1:a" "var1:b"
[[3]]
[1] "var2:x" "var2:y" "var2:z"
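Note that the result above is an unnamed list and its elements carry no names attribute. If both are needed, as in the var1 example from the question, one possible base R sketch uses Map together with setNames. The object name `named` is made up for illustration, and `as.character()` is there only as a precaution in case the columns are factors.

```r
# Rebuild the example data and the list of unique values from the question
mydf <- data.frame(ID = LETTERS,
                   var1 = rep(c("a", "b"), each = 13),
                   var2 = c(rep("x", 10), rep("y", 12), rep("z", 4)))
mylist <- lapply(mydf, unique)

# For each element: values become "<variable>:<value>", names stay the original values.
# Map keeps the names of its first argument, so the result is still named ID, var1, var2.
named <- Map(function(vals, nm) {
  vals <- as.character(vals)
  setNames(paste0(nm, ":", vals), vals)
}, mylist, names(mylist))

named$var1        # named character vector: a = "var1:a", b = "var1:b"
named$var1[["a"]] # look up by original value
```

The same thing could be written with sapply(names(mylist), ..., simplify = FALSE); Map is used here only because it passes the values and the variable name side by side.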