update multiple rows based on conditions

Time:10-15

I would like to update three columns simultaneously based on the value of one column.

My data looks like this

df <- data.frame(input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.", 
                  "Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
                 n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
                 n2 = c("Antidesma", NA, "Alchornea", NA),
                 n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA))

                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg.                 <NA>      <NA>           <NA>
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg.                 <NA>      <NA>           <NA>

When the first two words of the input column are the same in two rows, I would like those rows to have the same values in the other columns. In this example, the NA values of n1, n2 and n3 in the 2nd and 4th rows should be filled with the values from the 1st and 3rd rows.

My desired output is:

                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg. Alchornea parviflora Alchornea  Euphorbiaceae

Any suggestions for this case?

CodePudding user response:

You can use the dplyr package. First I create a grouping column gr that contains only the first two words of input. Then I change (mutate) the columns n1, n2 and n3 by filling each one with the first non-NA value found in its group.

library(dplyr)

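df %>%
  # gr = first two words of input, e.g. "Antidesma cuspidatum"
  group_by(gr = gsub("(^\\w+ \\w+) .*", "\\1", input)) %>%
  # within each group, fill n1, n2 and n3 with the group's first non-NA value
  mutate(across(c(n1, n2, n3), ~.x[!is.na(.x)][1])) %>%
  ungroup()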
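
For comparison, here is a minimal base R sketch of the same idea (group rows by the first two words of input, then fill each column with the group's first non-NA value). It is not part of the original answer. The names key, groups and first_non_na are hypothetical, chosen only for illustration, and the key is built by splitting on single spaces rather than with the regular expression used above.

```r
# Example data from the question
df <- data.frame(
  input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.",
            "Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
  n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
  n2 = c("Antidesma", NA, "Alchornea", NA),
  n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA),
  stringsAsFactors = FALSE
)

# Grouping key: the first two space-separated words of `input`
# (assumes every input has at least two words)
key <- vapply(
  strsplit(df$input, " ", fixed = TRUE),
  function(words) paste(words[1:2], collapse = " "),
  character(1)
)

# Hypothetical helper: first non-NA value of a vector (NA if there is none)
first_non_na <- function(x) x[!is.na(x)][1]

# Row indices belonging to each key
groups <- split(seq_len(nrow(df)), key)

# Within each group, overwrite every value with the group's first non-NA value
for (col in c("n1", "n2", "n3")) {
  for (idx in groups) {
    df[[col]][idx] <- first_non_na(df[[col]][idx])
  }
}

print(df)
```

Like the dplyr version, this overwrites every value in a group with the first non-NA one, so it assumes each group has at most one distinct non-NA value per column.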