How to update columns based on another column

I would like to know how to update the values in one column based on another column.

My example looks like this:

df <- data.frame(input = c("Teline stenopetala", "Teline stenopetala", 
                           "Teline stenopetala", "Prunus lusitanica", "Prunus lusitanica"),
                 n1 = c("Genista stenopetala Webb & Berthel.", NA, NA, NA, "Prunus hixa Brouss ex. Willd."))

The df has an input column and an n1 column whose values correspond to the values in input.

When a value is duplicated in the input column, I want the rows where n1 is blank (NA) to be filled with the n1 value that is already present for that same input value.

The real data is large, so I do not want to hard-code specific names, e.g. using df %>% mutate(n1 = case_when(input == "Teline stenopetala" ~ "..."

My desired output here:

df <- data.frame(input = c("Teline stenopetala", "Teline stenopetala", 
                           "Teline stenopetala", "Prunus lusitanica", "Prunus lusitanica"),
                 n1 = c("Genista stenopetala Webb & Berthel.", "Genista stenopetala Webb & Berthel.", "Genista stenopetala Webb & Berthel.", "Prunus hixa Brouss ex. Willd.", "Prunus hixa Brouss ex. Willd."))

CodePudding user response：

You can group by input and fill n1 in both directions within each group (.direction = "downup"), as also suggested by @margusl:

library(tidyverse)
df %>%
  group_by(input) %>%
  fill(n1, .direction = "downup")

Output:

# A tibble: 5 × 2
# Groups:   input [2]
  input              n1                                 
  <chr>              <chr>                              
1 Teline stenopetala Genista stenopetala Webb & Berthel.
2 Teline stenopetala Genista stenopetala Webb & Berthel.
3 Teline stenopetala Genista stenopetala Webb & Berthel.
4 Prunus lusitanica  Prunus hixa Brouss ex. Willd.      
5 Prunus lusitanica  Prunus hixa Brouss ex. Willd.
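
If you would rather not load any packages, the same idea can be sketched in base R with ave(), which applies a function to n1 separately for each input value. The helper name fill_first is made up for this example, and it is a slightly different rule from fill(): it copies the first non-missing value of the group into the blanks instead of carrying the nearest neighbour down and then up. For data like yours, with one known value per group, the result should be the same.

```r
df <- data.frame(
  input = c("Teline stenopetala", "Teline stenopetala", "Teline stenopetala",
            "Prunus lusitanica", "Prunus lusitanica"),
  n1 = c("Genista stenopetala Webb & Berthel.", NA, NA, NA,
         "Prunus hixa Brouss ex. Willd."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace the NAs in x with the first non-NA value of x.
# A vector with no known value at all is returned unchanged (stays NA).
fill_first <- function(x) {
  known <- x[!is.na(x)]
  if (length(known) == 0) return(x)
  x[is.na(x)] <- known[1]
  x
}

# ave() splits n1 by input, applies fill_first to each piece,
# and puts the results back in the original row order.
df$n1 <- ave(df$n1, df$input, FUN = fill_first)
df
```

Rows that already have a value are left as they are; only the NA rows are overwritten.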

CodePudding user response：

Try this, which copies the non-missing n1 value of each input group to every row in that group:

df <- data.frame(input = c("Teline stenopetala", "Teline stenopetala", 
                           "Teline stenopetala", "Prunus lusitanica", "Prunus lusitanica"),
                 n1 = c("Genista stenopetala Webb & Berthel.", NA, NA, NA, "Prunus hixa Brouss ex. Willd."))

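library(dplyr, warn.conflicts = FALSE)
df %>%
  group_by(input) %>%
  # Use the first non-missing n1 in each group for every row of that group.
  # The trailing [1] keeps this valid when a group has several known values
  # (the first one wins) or none at all (the result is NA).
  mutate(n1 = n1[!is.na(n1)][1])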

#> # A tibble: 5 × 2
#> # Groups:   input [2]
#>   input              n1                                 
#>   <chr>              <chr>                              
#> 1 Teline stenopetala Genista stenopetala Webb & Berthel.
#> 2 Teline stenopetala Genista stenopetala Webb & Berthel.
#> 3 Teline stenopetala Genista stenopetala Webb & Berthel.
#> 4 Prunus lusitanica  Prunus hixa Brouss ex. Willd.      
#> 5 Prunus lusitanica  Prunus hixa Brouss ex. Willd.
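
Both answers quietly assume that every input value has exactly one known n1. On a large table it may be worth checking that first. Below is a minimal sketch of such a check; df2 and its values are made-up data, not from the question, with one group that has two different known values ("B") and one that has none ("C").

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data: "A" is fine, "B" has conflicting values, "C" has no value.
df2 <- data.frame(
  input = c("A", "A", "B", "B", "C"),
  n1 = c("x", NA, "y", "z", NA),
  stringsAsFactors = FALSE
)

# Count the distinct non-missing n1 values per input
# and keep only the groups where that count is not exactly 1.
problems <- df2 %>%
  group_by(input) %>%
  summarise(n_known = n_distinct(n1, na.rm = TRUE)) %>%
  filter(n_known != 1)

problems
```

If problems comes back empty, filling within groups is unambiguous. Otherwise you need to decide what should happen for the listed input values before filling.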

Created on 2022-06-12 by the reprex package (v2.0.1)