Creating new column based on cluster in R

Time:07-26

Dear Stack Overflow users,

I am struggling with R. I have not used it much, as I usually work in Stata.

My data set has several clusters

Data figure

What I want to do is create a new column for each cluster, so that the clusters become columns and each column holds the values belonging to that cluster.

Many thanks in advance

CodePudding user response:

One approach is to pad each cluster with NA so that every cluster has the same number of values, and then pivot to wide format. Using dummy data:

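library(tidyverse)

# Dummy data: cluster 4 has two values, the other clusters have one each
df <- data.frame(
  value = rnorm(5),
  cluster = c(1:4, 4)
)

# Size of the largest cluster
n_max <- max(table(df$cluster))

# Pad every smaller cluster with NA rows until it has n_max rows
for (i in unique(df$cluster)) {
  m <- n_max - sum(df$cluster == i)
  if (m > 0) {
    pad <- data.frame(value = rep(NA_real_, m), cluster = rep(i, m))
    df <- rbind(df, pad)
  }
}

# Number the rows within each cluster, then spread the clusters into columns
df %>%
  group_by(cluster) %>%
  mutate(n = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = cluster, values_from = value) %>%
  select(-n)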

      `1`    `2`    `3`    `4`
    <dbl>  <dbl>  <dbl>  <dbl>
1 -0.0549  0.250  0.618 -0.173
2 NA      NA     NA     -2.22 
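
As a side note, the padding loop may not be strictly needed: as far as I know, pivot_wider() fills missing combinations with NA on its own, so numbering the rows within each cluster should be enough. A minimal sketch, assuming the same two-column layout (the values below are made up for illustration, and the helper column name `row` is my own choice):

```r
library(dplyr)
library(tidyr)

# Hypothetical example data: cluster 4 has two values, the others one each
df <- data.frame(
  value   = c(0.5, -1.2, 0.3, 2.1, -0.7),
  cluster = c(1, 2, 3, 4, 4)
)

wide <- df %>%
  group_by(cluster) %>%
  mutate(row = row_number()) %>%  # position of each value within its cluster
  ungroup() %>%
  pivot_wider(names_from = cluster, values_from = value) %>%
  select(-row)                    # drop the helper column

wide
```

Clusters with fewer values than the largest one simply get NA in the remaining rows, which matches the layout of the output above.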