I imported an xlsx document into R and found several duplicates in it. When I try to remove them using !duplicated(), I keep getting the following error:
Error: Must subset columns with a valid subscript vector.
ℹ Logical subscripts must match the size of the indexed input.
x Input has size 30 but subscript !duplicated(export) has size 33376.
Below is the code I have so far:
library(readxl)   # read_xlsx()
library(janitor)  # clean_names()
cb <- read.csv("120Water Request_KE.csv")
export <- read_xlsx("Anderson, IN _ Ziptility Export_KE.xlsx")
cb <- clean_names(cb)
export <- clean_names(export)
export <- export[!duplicated[[export, ]
Thank you
CodePudding user response:
I think what you are looking for is:
export <- export[!duplicated(export), ]
or
library(tidyverse)
export <- export %>%
  distinct(., .keep_all = TRUE)
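To see why the comma matters, here is a minimal base R sketch on made-up data (the `toy` data frame and its columns are hypothetical, not taken from the question's file). duplicated() returns one TRUE/FALSE per row. Without the comma, that vector is used to pick columns, which matches the error in the question: 30 columns against 33376 rows.

```r
# Hypothetical toy data standing in for `export`; row 3 repeats row 1.
toy <- data.frame(
  id   = c(1, 2, 1, 3),
  city = c("A", "B", "A", "C"),
  stringsAsFactors = FALSE
)

# One logical value per row: TRUE marks a row that already appeared above it.
is_dup <- duplicated(toy)

# The comma makes this a row subset; the empty slot after it keeps every column.
deduped <- toy[!is_dup, ]

nrow(toy)      # 4
nrow(deduped)  # 3
```

The first occurrence of each duplicated row is the one that is kept.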
CodePudding user response:
This is a simple syntax error.
It's difficult to rewrite your code without having access to your dataset, but the syntax you're looking for is probably this:
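export[!duplicated(export$column_name), ]
Just change the square brackets after duplicated to round ones, keep the comma before the closing bracket, and pass the column you want it to check. Note that by = is an argument of data.table's duplicated() method; the data frame method has no such argument, so on the tibble returned by read_xlsx it would not restrict the check to that column.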
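For a plain data frame or tibble, one way to check a single column is to pass that column itself to duplicated(). A minimal sketch, assuming a made-up `toy` data frame with a hypothetical `id` column in place of your real column name:

```r
# Hypothetical toy data: rows 1 and 3 share an id but differ in `city`.
toy <- data.frame(
  id   = c(1, 2, 1, 3),
  city = c("A", "B", "Z", "C"),
  stringsAsFactors = FALSE
)

# Whole-row check: no two rows match in every column, so nothing is flagged.
any_full_dup <- any(duplicated(toy))

# Column-only check: the third row repeats id 1, so it is dropped.
by_id <- toy[!duplicated(toy$id), ]

any_full_dup  # FALSE
by_id$city    # "A" "B" "C"
```

With dplyr loaded, distinct(export, column_name, .keep_all = TRUE) should give the same kind of result: one row per value of the column, with the other columns kept.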