Place strings of column in another column in R

I have a list of data frames. Here is an example of one of them:

   `Basics Chest` Anatomie                                Atlas                   
   <lgl>          <chr>                                   <chr>                   
 1 NA             NA                                      Xray                    
 2 NA             NA                                      CT                      
 3 NA             NA                                      PET-CT                  
 4 NA             CT Protokolle Chest Standard            NA

Now I want to take the header of the first column (in this case "Basics Chest") and append it to the strings in the other columns, like this:

   `Basics Chest` Anatomie                                    Atlas                   
   <lgl>          <chr>                                       <chr>                   
 1 NA             NA                                          Xray - Basics Chest                   
 2 NA             NA                                          CT - Basics Chest                      
 3 NA             NA                                          PET-CT - Basics Chest                 
 4 NA             CT Protokolle Chest Standard - Basics Chest NA

As you can see, NA values should be left untouched (I have to keep them, so filtering them out in a prior step is not an option).

This should work for my whole list of data frames, which have varying numbers of columns, as I am thinking about putting this into a for loop. Are there any elegant solutions?

Kind regards

CodePudding user response：

If I understand correctly what you're trying to do, I think you're looking for the purrr package, which is part of the tidyverse, and specifically its map() family of functions. It is one of the best tools to know if you're using R: it cleans up code tremendously and makes a lot of sense once you get used to it. It does take a while to wrap your head around, since it requires that you understand both lists and functions fairly well, but the rewards of using purrr are substantial.

The map functions go through lists or vectors and apply a function to each element. I think there's a whole chapter on them in R for Data Science, which is free and highly recommended.

An important thing to be aware of here (you'll see this in step two below) is that a dataframe is essentially a list of vectors of the same length.
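As a small illustration of both points (a sketch only; the toy vector and the data frame `df` are made up for this example), map-style functions visit each element of a vector, and they visit each column of a data frame because a data frame is a list of columns:

```r
library(purrr)

# map over a plain vector: one result per element
squares <- map_dbl(1:3, ~ .x^2)

# A data frame is a list of equal-length columns,
# so mapping over it visits one column at a time
df <- data.frame(x = c("a", "b"), y = c(1, 2), stringsAsFactors = FALSE)
df_is_list <- is.list(df)

# Ask each column for its class; the result is named after the columns
col_classes <- map_chr(df, class)
```

Here `map_dbl()` and `map_chr()` are the purrr variants that return a numeric or character vector instead of a list.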

In the solution below:

1. I first generate dummy data (a list of data frames).
2. I write a function that grabs the name of the first column and then appends that text to every column in the data frame.
3. I apply the function created in step two to the whole list of data frames.

Let me know if you have any questions or if I misunderstood anything.

library(tidyverse) # provides tibble(), map(), map_df(), str_c() and %>%

#STEP 1: Create dummy data
df.list <- list(
  "first" = tibble(
    `name 1` = NA,
    a = c(letters[1:5], NA),
    b = c(LETTERS[1:4], NA, "HI!!")
  ),
  "second" = tibble(
    `name 2` = NA,
    d = c(letters[1:5], NA),
    e = c(LETTERS[1:4], NA, "HI!!")
  ),
  "third" = tibble(
    `name 3` = NA,
    f = c(letters[1:5], NA),
    g = c(LETTERS[1:4], NA, "HI!!")
  )
)

#STEP 2: Create function that will be applied to each data frame
add_first_col_name <- function (df) {

  
  first.name <- names(df)[1]
  
  #Note: the code below attaches the text to every column. This will turn any
  #non-text columns into text. Based on your example, I think this is okay
  #but let me know if not - there are extra steps that could solve this.
  
  df %>%
    map_df(~str_c(.x, " - ", first.name))
}

#STEP 3: Use map() to apply function to each data frame in the list
map(df.list, add_first_col_name)
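The note in step two mentions that non-text columns get converted to text. One possible way around that, sketched here under the assumption that only character columns should receive the suffix, is to restrict the change with dplyr's across() and where(). The function name `add_first_col_name_chr` and the small `example.list` are hypothetical and only serve this illustration:

```r
library(dplyr)
library(stringr)
library(purrr)

# Hypothetical variant of add_first_col_name():
# only character columns are modified, all other columns keep their type
add_first_col_name_chr <- function(df) {
  first.name <- names(df)[1]
  df %>%
    mutate(across(where(is.character), ~ str_c(.x, " - ", first.name)))
}

# Made-up data: a logical NA column, a character column and a numeric column
example.list <- list(
  first = tibble(`name 1` = NA, a = c("x", NA), n = c(1, 2))
)

result <- map(example.list, add_first_col_name_chr)
```

Because str_c() returns NA whenever one of its inputs is NA, the missing values in the character columns stay missing.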

CodePudding user response：

We can use ifelse() to paste the suffix onto the non-NA values of Atlas

df1$Atlas <- with(df1, ifelse(is.na(`Basics Chest`) & !is.na(Atlas), 
paste(Atlas, "- Basics Chest"), Atlas))

For multiple columns, just loop over the columns other than `Basics Chest` and do the same

df1[-1] <- lapply(df1[-1], \(x) ifelse(!is.na(x) & 
     is.na(df1[["Basics Chest"]]), paste(x, "- Basics Chest"), x))
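The loop above hard-codes the header "Basics Chest". For a list of data frames whose first column has a different name each time, the same idea could be wrapped in a function that reads the header from names(df)[1]. This is only a sketch: the helper name `append_header` and the two small data frames (including the "Basics Head" header) are made up for illustration.

```r
# Hypothetical helper: append the first column's header to every other column,
# leaving NA values untouched
append_header <- function(df) {
  header <- names(df)[1]
  df[-1] <- lapply(df[-1], function(x) {
    ifelse(!is.na(x) & is.na(df[[1]]), paste(x, "-", header), x)
  })
  df
}

# Made-up list of data frames with different headers and column counts
df_list <- list(
  chest = data.frame(`Basics Chest` = NA, Atlas = c("Xray", NA),
                     check.names = FALSE, stringsAsFactors = FALSE),
  head  = data.frame(`Basics Head` = NA, Anatomie = c(NA, "CT"),
                     Atlas = c("MRI", NA),
                     check.names = FALSE, stringsAsFactors = FALSE)
)

out <- lapply(df_list, append_header)
```

lapply() returns a list of the same length and with the same names as `df_list`, so no explicit for loop is needed.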

Or with dplyr

library(dplyr)
library(stringr)
df1 <- df1 %>%
   mutate(across(-`Basics Chest`, 
   ~ case_when(!is.na(.x) & is.na(`Basics Chest`)
   ~ str_c(.x, ' - Basics Chest'),
   # keep the original value (including NA) when the condition is not met
   TRUE ~ as.character(.x))))

-output

df1
Basics Chest                                    Anatomie                 Atlas
1           NA                                        <NA>   Xray - Basics Chest
2           NA                                        <NA>     CT - Basics Chest
3           NA                                        <NA> PET-CT - Basics Chest
4           NA CT Protokolle Chest Standard - Basics Chest                  <NA>

data

df1 <- structure(list(`Basics Chest` = c(NA, NA, NA, NA), Anatomie = c(NA, 
NA, NA, "CT Protokolle Chest Standard"), Atlas = c("Xray", "CT", 
"PET-CT", NA)), class = "data.frame", row.names = c("1", "2", 
"3", "4"))