I have data as follows:
df <- structure(list(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
2, 2), year = c(2001, 2002, 2003, 2001, 2002, 2003, 2001, 2002,
2003, 2001, 2002, 2003, 2001, 2002, 2003), Type = c("A", "A",
"A", "B", "B", "B", "A", "A", "A", "B", "B", "B", "C", "C", "C"
), Subtype = c(2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
Value = c(0.480513615083894, 0.909788893002047, 0.685141970365005,
0.138835747632889, 0.899508237239289, 0.535632890739584,
0.0712054637209442, 0.655905506366812, 0.694753916517691,
0.469249523993816, 0.295044859429007, 0.209906890342936,
0.193574644156237, 0.0715219759792846, 0.626529278499682)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -15L))
library(data.table)
df <- setDT(df)[, mVal := mean(Value), by = Type]
table(df$mVal, df$Type)
                    A B C
  0.297208632878401 0 0 3
  0.424696358229587 0 6 0
  0.582884894176066 6 0 0
I really like the information this table provides, so I was wondering whether there is an easy way to convert it into a long format with one row per non-zero cell, as below:
Desired output:
mVal N Type
0.297208632878401 3 C
0.424696358229587 6 B
0.582884894176066 6 A
CodePudding user response:
We can convert the table object directly to a data.frame with as.data.frame, which returns a long data.frame, and then subset the rows where Freq is not 0:
out <- subset(as.data.frame(table(df$mVal, df$Type)), Freq != 0)
names(out) <- c("mVal", "Type", "N")
-output
> out
mVal Type N
3 0.582884894176065 A 6
5 0.424696358229587 B 6
7 0.297208632878401 C 3
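One thing to keep in mind with this approach: as.data.frame on a table turns the classifying variables into factors, so mVal is no longer numeric afterwards. A minimal sketch of converting it back, using a small made-up data.frame (toy is hypothetical, not the question's data):

```r
# Toy data (hypothetical) standing in for df after mVal has been added
toy <- data.frame(mVal = c(0.25, 0.25, 0.75), Type = c("A", "A", "B"))

out <- subset(as.data.frame(table(toy$mVal, toy$Type)), Freq != 0)
names(out) <- c("mVal", "Type", "N")

# Both mVal and Type are factors at this point
is.factor(out$mVal)

# Go through character first; as.numeric() on a factor returns the level codes
out$mVal <- as.numeric(as.character(out$mVal))
out$Type <- as.character(out$Type)
```

The factor labels are a rounded text representation of the original doubles, so the round trip may not reproduce the means to full precision. If exact values matter, computing the counts directly from the data is probably the safer route.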
As the original object is a tibble, we could also use a tidyverse solution to get the long format directly, instead of building the table and then reshaping it:
library(dplyr)
df %>%
count(mVal, Type, name = "N")
-output
mVal Type N
<num> <char> <int>
1: 0.2972086 C 3
2: 0.4246964 B 6
3: 0.5828849 A 6
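Since the question already calls setDT, the same summary can probably be produced in data.table itself, computing the group mean and the row count in one step. This is a sketch under that assumption, again on made-up data (toy and res are hypothetical names):

```r
library(data.table)

# Toy data (hypothetical), same column layout as the question's df
toy <- data.table(
  Type  = c("A", "A", "B", "B", "B", "C"),
  Value = c(0.2, 0.4, 0.1, 0.3, 0.8, 0.9)
)

# One row per Type: the group mean and the number of rows in the group
res <- toy[, .(mVal = mean(Value), N = .N), by = Type]

# Sort by mVal to mirror the ordering of the desired output
setorder(res, mVal)
res
```

This skips both the intermediate mVal column and the table step, and keeps mVal numeric throughout.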