Home > Back-end >  Replacing strings in dataframe with column name in R
Replacing strings in dataframe with column name in R

Time:08-26

I want to replace strings in an R dataframe. The dataframe shows per production order (rows) when a resource / production step (columns) was utilized. For this particular analysis the time values are not needed, instead I want to have the column name in place of the timestamp.

The data looks something like this

df_current <- data.frame(
    Prod.order = seq(123, 127),
    B100 = c("01:00:00", "02:00:00", "03:00:00", "04:00:00", "05:00:00"),
    `B100 (2)` = c(NA, NA, "06:00:00", "07:00:00", NA),
    D200 = c("02:00:00", NA, NA, NA, "06:00:00"),
    D300 = c(NA, NA, "04:00:00", "05:00:00", "07:00:00"),
    check.names = FALSE)

And i want it to look like this. (Also i want to remove the NA's but this isnt a problem)

 df_desired <- data.frame(
    Prod.order = seq(123, 127),
    B100 = c("B100", "B100", "B100", "B100", "B100"),
    `B100 (2)` = c("", "", "B100 (2)", "B100 (2)", ""),
    D200 = c("D200", "", "", "", "D200"),
    D300 = c("", "", "D300", "D300", "D300"),
    check.names = FALSE)

It seems so simple but I havn't been able figure it out. p.s. it would be amazing if the solution fits in a dplyr pipline ;)

Thanks :D

CodePudding user response:

A base solution with names(df)[col(df)] replicating the column names:

df_current[-1] <- ifelse(is.na(df_current), '', names(df_current)[col(df_current)])[, -1]
df_current

#   Prod.order B100 B100 (2) D200 D300
# 1        123 B100          D200
# 2        124 B100
# 3        125 B100 B100 (2)      D300
# 4        126 B100 B100 (2)      D300
# 5        127 B100          D200 D300

or with t():

df_current[-1] <- t(ifelse(t(is.na(df_current)), '', names(df_current)))[, -1]
df_current

CodePudding user response:

library(tidyverse)

df_current %>%
  mutate(across(where(is.character), ~ifelse(is.na(.x), '', cur_column())))

  Prod.order B100 B100..2. D200 D300
1        123 B100          D200     
2        124 B100                   
3        125 B100 B100..2.      D300
4        126 B100 B100..2.      D300
5        127 B100          D200 D300

df_current %>%
   pivot_longer(where(is.character)) %>%
   mutate(value = ifelse(is.na(value), '', name)) %>%
   pivot_wider()

# A tibble: 5 x 5
  Prod.order B100  B100..2.   D200   D300  
       <int> <chr> <chr>      <chr>  <chr> 
1        123 B100  ""         "D200" ""    
2        124 B100  ""         ""     ""    
3        125 B100  "B100..2." ""     "D300"
4        126 B100  "B100..2." ""     "D300"
5        127 B100  ""         "D200" "D300"
  • Related