I have multiple dataframes and there are couple of dfs having similar IDs. I want to merge those dfs if they have similar row count.
df1_ID1
ID b c d
1 x y z
1 y z x
1 z x y
df2_ID1
ID b c d
1 x y z
1 y z x
1 z x y
df3_ID2
ID b c d
2 x y z
2 y z x
2 z x y
what I want is
ID1
ID b c d b c d
1 x y z x y z
1 y z x y z x
1 z x y z x y
I'm trying something like this code:
for (i in 1:length(df)) {
columnnames<-df[[i]][["b"]]
if (df[[i]][["ID"]] == df[[i ]][["ID"]])
merge.data.frame(df[[i]],df[[i ]], by = "ID")
}
CodePudding user response:
I'm not a fan of how this ends up with multiple columns with the same name, but that's what you wanted.
You aren't really asking for a merge because that would give 3 x 3 = 9 rows, so I used cbind.
(I changed the name of the list of data.frames to df_list to avoid confusion)
df_list <- list(
data.frame(ID = 1, b = c('x', 'y', 'z'), c = c('y', 'z', 'x'), d = c('z', 'x', 'y')),
data.frame(ID = 1, b = c('x', 'y', 'z'), c = c('y', 'z', 'x'), d = c('z', 'x', 'y')),
data.frame(ID = 2, b = c('x', 'y', 'z'), c = c('y', 'z', 'x'), d = c('z', 'x', 'y'))
)
for (i in 1:(length(df_list) - 1)) {
if (NROW(df_list[[i]]) == NROW(df_list[[i 1]]) &&
all(df_list[[i]]$ID == df_list[[i 1]]$ID)) {
df_list[[i]] <- cbind(df_list[[i]], df_list[[i 1]][, -1])
df_list[[i 1]] <- list()
}
}
df_list <- df_list[!sapply(df_list, function(x) NROW(x) == 0)]
df_list
[[1]]
ID b c d b c d
1 1 x y z x y z
2 1 y z x y z x
3 1 z x y z x y
[[2]]
ID b c d
1 2 x y z
2 2 y z x
3 2 z x y