I have a list of dfs. Here is an example of one df:
  `Basics Chest` Anatomie                     Atlas
  <lgl>          <chr>                        <chr>
1 NA             NA                           Xray
2 NA             NA                           CT
3 NA             NA                           PET-CT
4 NA             CT Protokolle Chest Standard NA
Now I want to take the header of the first column (in this case "Basics Chest") and append it to the strings in the following columns, like this:
  `Basics Chest` Anatomie                                    Atlas
  <lgl>          <chr>                                       <chr>
1 NA             NA                                          Xray - Basics Chest
2 NA             NA                                          CT - Basics Chest
3 NA             NA                                          PET-CT - Basics Chest
4 NA             CT Protokolle Chest Standard - Basics Chest NA
As you can see, the NA values shouldn't be touched by this (I have to keep them, so filtering them out in a prior step is not an option).
This should work for my whole list of data frames, which have varying numbers of columns, as I am thinking about putting this into a for loop. Any elegant solutions?
Kind regards
CodePudding user response:
If I understand correctly what you're trying to do, I think you're looking for the purrr package, which is part of the tidyverse, and specifically its map() family of functions. This is one of the best tools to know if you're using R; it cleans up code tremendously and makes a lot of sense once you get used to it. It does take a while to wrap your head around, because it requires that you understand both lists and functions fairly well, but the rewards of using purrr are substantial.
The map functions go through lists or vectors and apply a function to each element. I think there's a whole chapter on them in R for Data Science, which is free and highly recommended.
An important thing to be aware of here (you'll see this in step two below) is that a dataframe is essentially a list of vectors of the same length.
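A quick way to convince yourself of this is the small base R sketch below. The data frame `df` and the name `col_classes` are made up for illustration and are not part of the question's data.

```r
# Hypothetical example data, not taken from the question
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

is.list(df)    # a data frame is a list under the hood
length(df)     # number of list elements, i.e. number of columns
lengths(df)    # length of each element, i.e. number of rows per column

# Because of this, mapping over a data frame visits one column at a time
col_classes <- vapply(df, function(col) class(col)[1], character(1))
col_classes
```

This is why a function passed to map() or lapply() on a data frame receives whole columns, not rows or single cells.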
In the solution below:
- I first generate dummy data (a list of data frames).
- I then write a function that grabs the name of the first column and appends that text to every column in the data frame.
- Finally, I apply the function created in step two to the whole list of data frames.
Let me know if you have any questions or if I misunderstood anything.
# Load the tidyverse (provides tibble(), map(), map_df(), str_c() and %>%)
library(tidyverse)

# STEP 1: Create dummy data
df.list <- list(
  "first" = tibble(
    `name 1` = NA,
    a = c(letters[1:5], NA),
    b = c(LETTERS[1:4], NA, "HI!!")
  ),
  "second" = tibble(
    `name 2` = NA,
    d = c(letters[1:5], NA),
    e = c(LETTERS[1:4], NA, "HI!!")
  ),
  "third" = tibble(
    `name 3` = NA,
    f = c(letters[1:5], NA),
    g = c(LETTERS[1:4], NA, "HI!!")
  )
)
# STEP 2: Create function that will be applied to each data frame
add_first_col_name <- function(df) {
  first.name <- names(df)[1]
  # Note: the code below attaches the text to every column. This will turn any
  # non-text columns into text. Based on your example, I think this is okay,
  # but let me know if not - there are extra steps that could solve this.
  # str_c() returns NA whenever one of its inputs is NA, so NA values stay NA.
  df %>%
    map_df(~ str_c(.x, " - ", first.name))
}
#STEP 3: Use map() to apply function to each data frame in the list
map(df.list, add_first_col_name)
CodePudding user response:
We can use an ifelse based on NA in Atlas to paste
df1$Atlas <- with(df1, ifelse(is.na(`Basics Chest`) & !is.na(Atlas),
                              paste(Atlas, "- Basics Chest"), Atlas))
For multiple columns, just loop over the columns other than Basics Chest (the first column) and do the same
df1[-1] <- lapply(df1[-1], \(x) ifelse(!is.na(x) &
    is.na(df1[["Basics Chest"]]), paste(x, "- Basics Chest"), x))
Or with dplyr
library(dplyr)
library(stringr)
df1 <- df1 %>%
  mutate(across(-`Basics Chest`,
    ~ case_when(
      !is.na(.x) & is.na(`Basics Chest`) ~ str_c(.x, ' - Basics Chest'),
      TRUE ~ .x  # keep every other value (including NA) unchanged
    )))
-output
df1
  Basics Chest                                    Anatomie                 Atlas
1           NA                                        <NA>   Xray - Basics Chest
2           NA                                        <NA>     CT - Basics Chest
3           NA                                        <NA> PET-CT - Basics Chest
4           NA CT Protokolle Chest Standard - Basics Chest                  <NA>
data
df1 <- structure(list(`Basics Chest` = c(NA, NA, NA, NA), Anatomie = c(NA,
  NA, NA, "CT Protokolle Chest Standard"), Atlas = c("Xray", "CT",
  "PET-CT", NA)), class = "data.frame", row.names = c("1", "2",
  "3", "4"))
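Since the question asks for something that works across a whole list of data frames with different headers, here is one possible base R sketch that reads the suffix from names(df)[1] instead of hardcoding "Basics Chest". The helper `append_first_name` and the example objects `df_a`, `df_b` and `result` are hypothetical names introduced only for this illustration. As a simplifying assumption, it appends the suffix to every non-NA value in the other columns, without also checking that the first column is NA.

```r
# Hypothetical helper: append the first column's name to every non-NA
# value in the remaining columns, leaving NA values untouched.
append_first_name <- function(df) {
  if (ncol(df) < 2) return(df)   # nothing to modify
  suffix <- names(df)[1]
  df[-1] <- lapply(df[-1], function(x) {
    ifelse(is.na(x), x, paste(x, suffix, sep = " - "))
  })
  df
}

# Made-up example data with different headers and column counts
df_a <- data.frame(`Basics Chest` = NA,
                   Anatomie = c(NA, "CT Protokolle"),
                   Atlas = c("Xray", NA),
                   check.names = FALSE, stringsAsFactors = FALSE)
df_b <- data.frame(`Basics Head` = NA,
                   Atlas = c("MRI", NA),
                   check.names = FALSE, stringsAsFactors = FALSE)

# Apply the helper to every data frame in the list
result <- lapply(list(a = df_a, b = df_b), append_first_name)
result
```

Using lapply() over the list avoids writing an explicit for loop, and each data frame picks up its own first column name as the suffix.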