I would like to update three columns simultaneously based on one column.
My data looks like this:
df <- data.frame(input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.",
"Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
n2 = c("Antidesma", NA, "Alchornea", NA),
n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA))
                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg.                 <NA>      <NA>           <NA>
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg.                 <NA>      <NA>           <NA>
If the first two words of the input column are the same in two rows, then the corresponding values of n1, n2 and n3 should be the same as well. In this example, the missing values of n1, n2 and n3 in the 2nd and 4th rows should be filled with the values from the 1st and 3rd rows.
My desired output:
                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
Any suggestions for this case?
CodePudding user response:
You can use the dplyr package.
First I create a column gr which contains only the first two words of input. Then I change (or mutate) the columns n1, n2 and n3 by filling each one with the first non-NA value of its group.
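library(dplyr)

df %>%
  # gr holds the first two words of input and serves as the grouping key
  group_by(gr = gsub("(^\\w+ \\w+) .*", "\\1", input)) %>%
  # within each group, fill every column with its first non-NA value
  mutate(across(c(n1, n2, n3), ~ .x[!is.na(.x)][1])) %>%
  ungroup()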
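If you would rather not depend on dplyr, the same idea (build a key from the first two words, then fill each group with its first non-NA value) can probably be sketched in base R as well. This is only an illustration, not part of the answer above: the helper name first_non_na is made up here, and the key is extracted with a slightly different regex that assumes the first two words are separated by whitespace.

```r
# Rebuild the example data from the question
df <- data.frame(
  input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.",
            "Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
  n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
  n2 = c("Antidesma", NA, "Alchornea", NA),
  n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA),
  stringsAsFactors = FALSE
)

# Grouping key: the first two whitespace-separated words of input
gr <- sub("^(\\S+\\s+\\S+).*$", "\\1", df$input)

# Hypothetical helper: first non-NA value of a vector (NA if there is none)
first_non_na <- function(x) x[!is.na(x)][1]

# ave() applies the function per group and puts the results back in row order
for (col in c("n1", "n2", "n3")) {
  df[[col]] <- ave(df[[col]], gr,
                   FUN = function(x) rep(first_non_na(x), length(x)))
}

print(df)
```

Note that, like the dplyr version, this overwrites every value in a group with the group's first non-NA value, so it assumes the non-NA values within a group agree with each other.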