df is a data frame in which I need to group together the rows that have identical elements in the Name column. Finally, the duplicated elements in the Name column are to be removed.
df <- data.frame(Name = c("A","","","B","","","A","","","B","",""),
                 Test = c("test1","test2","test3","test1","test2","test3",
                          "test1.1","test2.1","test3.1","test1.1","test2.1","test3.1"))
Desired output:
> df
   Name    Test
1     A   test1
2         test2
3         test3
4       test1.1
5       test2.1
6       test3.1
7     B   test1
8         test2
9         test3
10      test1.1
11      test2.1
12      test3.1
CodePudding user response:
You could try the following with tidyverse. Replace empty character values with NA and fill down so that every row carries its Name value. Then sort by Name, keeping the names in order of first appearance. Finally, keep only the first Name in each group and blank the rest.
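library(tidyverse)

df %>%
  # turn "" into NA so that fill() can carry the last Name down
  mutate(Name = na_if(Name, "")) %>%
  fill(Name, .direction = "down") %>%
  # order the groups by first appearance of each Name in the original df
  arrange(match(Name, unique(df$Name))) %>%
  group_by(Name) %>%
  # keep the Name on the first row of each group, blank the others
  mutate(Name = ifelse(row_number() == 1, Name, ""))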
Output
   Name  Test
   <chr> <chr>
 1 "A"   test1
 2 ""    test2
 3 ""    test3
 4 ""    test1.1
 5 ""    test2.1
 6 ""    test3.1
 7 "B"   test1
 8 ""    test2
 9 ""    test3
10 ""    test1.1
11 ""    test2.1
12 ""    test3.1
CodePudding user response:
Here is one option with na.locf (from zoo) and arrange:
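library(dplyr)
library(zoo)

df %>%
  # sort by the Name carried forward into the blank rows
  arrange(na.locf(na_if(Name, ""))) %>%
  # blank every repeated non-empty Name, keeping its first occurrence
  mutate(Name = replace(Name, duplicated(Name) & Name != "", ""))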
-output
   Name    Test
1     A   test1
2         test2
3         test3
4       test1.1
5       test2.1
6       test3.1
7     B   test1
8         test2
9         test3
10      test1.1
11      test2.1
12      test3.1
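For comparison, the same steps (carry the Name forward, sort the groups, blank the repeats) can also be sketched in base R without extra packages. This is only an illustration, not the method of either answer. It assumes the first row has a non-empty Name, and the helper names idx, grp, ord and out are made up for this sketch.

```r
df <- data.frame(Name = c("A","","","B","","","A","","","B","",""),
                 Test = c("test1","test2","test3","test1","test2","test3",
                          "test1.1","test2.1","test3.1","test1.1","test2.1","test3.1"),
                 stringsAsFactors = FALSE)

# Running count of non-empty names; assumes the first Name is not ""
idx <- cumsum(df$Name != "")

# Carry each non-empty Name forward into the blank rows below it
grp <- df$Name[df$Name != ""][idx]

# Sort groups by first appearance; order() keeps ties in their original order
ord <- order(match(grp, unique(grp)))
out <- df[ord, ]

# Blank every Name except on the first row of its group
out$Name[duplicated(grp[ord])] <- ""
rownames(out) <- NULL

out
```

Because the groups are ordered with match() against unique(), names keep their order of first appearance instead of being sorted alphabetically.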