Order values in a dataframe by column names

I have a dataframe that looks something like this:

P1 <- c('P1 -> (Normal,_)', 'P1 -> (Normal,_)', 'NA', 'P5 -> (UP,_)')
P2 <- c('P4 -> (UP,_)', 'NA', 'P2 -> (UP,_)', 'P4 -> (UP,_)')
P3 <- c('P2 -> (UP,_)', 'P3 -> (UP,_)', 'P1 -> (UP,_)', 'P2 -> (UP,_)')
P4 <- c('NA', 'P4 -> (UP,_)', 'P3 -> (UP,_)', 'P3 -> (UP,_)')
P5 <- c('P3 -> (UP,_)', 'NA', 'NA', 'NA')

df <- data.frame(P1, P2, P3, P4, P5)

I need it to be ordered in a way that P1 column contains only P1 values, P2 column - P2 values, etc. If there is no value for that column, it should contain 'NA'.

So, the resulting dataframe should look like this:

CodePudding user response：

Having "NA" instead of NA probably isn't wise, but you can do this with a bit of indexing after matching up the P1/2/3/4/5 stem with the variable name:

sel <- df != "NA"           ## use is.na(df) instead if data is actually NA
out <- replace(df, , "NA")  ## use NA not "NA" if want an actual NA
out[ cbind(row(df)[sel], match(substr(df[sel],1,2), names(df)) ) ] <- df[sel]
out

#                P1           P2           P3           P4           P5
#1 P1 -> (Normal,_) P2 -> (UP,_) P3 -> (UP,_) P4 -> (UP,_)           NA
#2 P1 -> (Normal,_)           NA P3 -> (UP,_) P4 -> (UP,_)           NA
#3     P1 -> (UP,_) P2 -> (UP,_) P3 -> (UP,_)           NA           NA
#4               NA P2 -> (UP,_) P3 -> (UP,_) P4 -> (UP,_) P5 -> (UP,_)
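For readers who find the one-line assignment hard to follow, here is a step-by-step sketch of the same matrix-indexing idea. It makes two assumptions that are not in the question: the data holds real `NA` values rather than the string "NA", and every value starts with its target column name followed by a space. The names `m`, `stem`, and `target` are only illustrative.

```r
# Example data from the question, but with real NA instead of the string "NA"
df <- data.frame(
  P1 = c("P1 -> (Normal,_)", "P1 -> (Normal,_)", NA, "P5 -> (UP,_)"),
  P2 = c("P4 -> (UP,_)", NA, "P2 -> (UP,_)", "P4 -> (UP,_)"),
  P3 = c("P2 -> (UP,_)", "P3 -> (UP,_)", "P1 -> (UP,_)", "P2 -> (UP,_)"),
  P4 = c(NA, "P4 -> (UP,_)", "P3 -> (UP,_)", "P3 -> (UP,_)"),
  P5 = c("P3 -> (UP,_)", NA, NA, NA),
  stringsAsFactors = FALSE
)

m      <- as.matrix(df)               # character matrix with the same shape as df
sel    <- !is.na(m)                   # TRUE where a cell holds a value
stem   <- sub(" .*$", "", m[sel])     # text before the first space, e.g. "P1"
target <- match(stem, colnames(m))    # destination column index for each value
stopifnot(!anyNA(target))             # assumption: every stem is a column name

out   <- m
out[] <- NA                           # start from an all-NA character matrix
# A two-column (row, column) index matrix addresses one cell per value
out[cbind(row(m)[sel], target)] <- m[sel]
out <- as.data.frame(out, stringsAsFactors = FALSE)
out
```

Taking the text before the first space, rather than the first two characters, is a design choice so that a column named `P10` would also match. If a row held two values with the same stem, the later one would silently overwrite the earlier one.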

CodePudding user response：

This code uses five different approaches, one per column, but they all do the same thing: keep a value only if it belongs to that column and replace it with NA otherwise. You can pick one and reuse it for the other columns.

Maybe it will be useful ;)

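    # grepl() is part of base R, so it needs no library() call
    library(stringr)
    library(dplyr)
    df %>% mutate(P1 = ifelse(P1 == "P1 -> (Normal,_)", P1, NA),  # exact match
                  P2 = ifelse(P2 != "P2 -> (UP,_)", NA, P2),      # exact match, negated
                  P3 = ifelse(grepl("P3", P3), P3, NA),           # base R pattern match
                  P4 = ifelse(str_starts(P4, "P4"), P4, NA),      # stringr prefix test
                  P5 = ifelse(str_detect(P5, "P5"), P5, NA))      # stringr pattern match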
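One possible way to apply a single check to every column without repeating it by hand is sketched below in base R. The helper name `keep_own` is hypothetical, and the sketch assumes each value starts with the name of the column it belongs to. Note that, like the `mutate()` call above, this only blanks out values that sit in the wrong column. It does not move them to the right one, which is what the indexing approach in the first answer does.

```r
# Example data from the question (missing entries are the string "NA")
df <- data.frame(
  P1 = c("P1 -> (Normal,_)", "P1 -> (Normal,_)", "NA", "P5 -> (UP,_)"),
  P2 = c("P4 -> (UP,_)", "NA", "P2 -> (UP,_)", "P4 -> (UP,_)"),
  P3 = c("P2 -> (UP,_)", "P3 -> (UP,_)", "P1 -> (UP,_)", "P2 -> (UP,_)"),
  P4 = c("NA", "P4 -> (UP,_)", "P3 -> (UP,_)", "P3 -> (UP,_)"),
  P5 = c("P3 -> (UP,_)", "NA", "NA", "NA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep values that start with the column name, else NA
keep_own <- function(values, col_name) {
  ifelse(startsWith(values, col_name), values, NA_character_)
}

filtered   <- df
# Map() pairs each column with its own name; [] keeps the data frame structure
filtered[] <- Map(keep_own, df, names(df))
filtered
```

Because `startsWith()` is a plain prefix test, a column named `P1` would also accept values beginning with `P10`; a stricter pattern would be needed if such names can occur.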