In R 4.1.2, how to remove duplicate cells in a row, leaving only the first cell

Time:01-03

How can I remove a duplicated cell within a row, leaving only the first occurrence? In the data below, the second A3 (in column V2) should become NA.

V1  V2  V3
A1  NA  C1
A2  NA  C2
A3  A3  C3
A4  NA  C4
A5  NA  C5
A6  NA  C6
A7  NA  C7
A8  NA  C8

My target:

V1  V2  V3
A1  NA  C1
A2  NA  C2
A3  NA  C3
A4  NA  C4
A5  NA  C5
A6  NA  C6
A7  NA  C7
A8  NA  C8

CodePudding user response:

A possible solution:

library(tidyverse)

df <- data.frame(
  stringsAsFactors = FALSE,
  V1 = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"),
  V2 = c(NA, NA, "A3", NA, NA, NA, NA, NA),
  V3 = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8")
)

# Set V2 to NA where it repeats V1; if_else() leaves V2 as NA where it is already missing
df %>% 
  rowwise() %>% 
  mutate(V2 = if_else(V1 == V2, NA_character_, V2)) %>% 
  ungroup()

#> # A tibble: 8 × 3
#>   V1    V2    V3   
#>   <chr> <chr> <chr>
#> 1 A1    <NA>  C1   
#> 2 A2    <NA>  C2   
#> 3 A3    <NA>  C3   
#> 4 A4    <NA>  C4   
#> 5 A5    <NA>  C5   
#> 6 A6    <NA>  C6   
#> 7 A7    <NA>  C7   
#> 8 A8    <NA>  C8
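
If the duplicate could sit in any column rather than only in V2, a base R variant might be easier to generalise. The following is only a sketch, assuming every column is character; `blank_repeats` and `df_clean` are names made up for this illustration and are not part of the answer above.

```r
# Same example data as in the answer above
df <- data.frame(
  stringsAsFactors = FALSE,
  V1 = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"),
  V2 = c(NA, NA, "A3", NA, NA, NA, NA, NA),
  V3 = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8")
)

# Hypothetical helper: within one row, replace every value that has
# already appeared further left with NA, keeping the first occurrence.
blank_repeats <- function(x) {
  x[duplicated(x)] <- NA
  x
}

# apply() walks the rows and returns one column per row, so transpose
# the result back before writing it into a copy of the data frame.
df_clean <- df
df_clean[] <- t(apply(df, 1, blank_repeats))

print(df_clean)
```

Because apply() converts the data frame to a matrix first, this only behaves predictably when all columns share one type, as they do here.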

CodePudding user response:

for(x in seq_len(nrow(dataset))) 
{
if(dataset[x,2]%in
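
The response above is cut off in the source, so its exact condition is unknown. As a rough sketch of how a row-by-row loop could handle this case, assuming `dataset` holds the question's data and that the second column is compared with the first (both assumptions, not taken from the truncated answer):

```r
# Assumed stand-in for the question's data
dataset <- data.frame(
  stringsAsFactors = FALSE,
  V1 = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"),
  V2 = c(NA, NA, "A3", NA, NA, NA, NA, NA),
  V3 = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8")
)

# seq_len() visits every row; nrow(dataset) alone would give only the last one
for (x in seq_len(nrow(dataset))) {
  # isTRUE() guards against the NA that `==` returns when a cell is missing
  if (isTRUE(dataset[x, 2] == dataset[x, 1])) {
    dataset[x, 2] <- NA
  }
}

print(dataset)
```

A loop like this is easy to read but slower than the vectorised approaches on large data.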