I have a dataset with a column for IDs. A few of these IDs are not integers but have decimal points. The dataset is very large, so I cannot just skim through it. Here is an example:
ID <- c(1, 2, 3, 4, 5.19, 6, 7, 8, 9, 10.732)
I would like to return the value and/or position of those with decimals. So something like:
[5] 5.19
[10] 10.732
Is this possible?
Many thanks in advance.
CodePudding user response:
Values,
ID[ID != as.integer(ID)]
# [1] 5.190 10.732
positions,
which(ID != as.integer(ID))
# [1] 5 10
and all together.
setNames(ID, seq_along(ID))[ID != as.integer(ID)]
# 5 10
# 5.190 10.732
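One caveat, as far as I can tell: as.integer() returns NA (with a warning) for values outside R's 32-bit integer range, so very large IDs would be dropped silently by which(). A minimal sketch that avoids this by comparing against trunc() instead. The vector big_ID and its two extra values are made up for illustration:

```r
ID <- c(1, 2, 3, 4, 5.19, 6, 7, 8, 9, 10.732)

# Hypothetical extension: two IDs beyond the 32-bit integer range,
# one whole and one with a fractional part.
big_ID <- c(ID, 3e9, 3e9 + 0.5)

# trunc() keeps the double type, so no coercion to integer happens.
has_decimals <- big_ID != trunc(big_ID)

which(has_decimals)    # positions of the non-integer IDs
big_ID[has_decimals]   # the corresponding values
```

If all IDs fit comfortably in the integer range, the as.integer() version above should behave the same way.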
CodePudding user response:
You could use the modulo operation and then compare the results with 0:
dplyr::near(ID %% 1, 0)
That would return:
[1]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
Then, you can negate it and return the positions of non-integers:
which(!dplyr::near(ID %% 1, 0))
[1] 5 10
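If you would rather not depend on dplyr, a base R sketch of the same tolerance idea is to measure the distance to the nearest whole number. The vector x and the name tol are only for illustration; the tolerance is the square root of machine epsilon, which I believe matches the default used by dplyr::near():

```r
# Illustrative values: an exact integer, two values that are whole
# numbers up to floating-point noise, and a genuine non-integer.
x <- c(2, sqrt(2)^2, 1 - 1e-12, 5.19)

# Assumed tolerance, roughly 1.5e-8.
tol <- sqrt(.Machine$double.eps)

# Flag values whose distance to the nearest whole number exceeds tol.
not_whole <- abs(x - round(x)) > tol

which(not_whole)
```

One difference to be aware of: a value just below a whole number, such as 1 - 1e-12, has a remainder close to 1 rather than 0, so the %% 1 approach would flag it, while the round() version treats it as whole. Which behaviour you want depends on how the IDs were produced.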
Alternatively, you could convert it to a character vector and then look for the decimal point:
which(grepl("[0-9]\\.[0-9]", as.character(ID)))
[1] 5 10
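Since the question mentions a column in a dataset, here is a small sketch that returns positions and values side by side. The data frame name df and the column names of the result are assumptions, not something from the question:

```r
# Hypothetical data frame standing in for the real dataset.
df <- data.frame(ID = c(1, 2, 3, 4, 5.19, 6, 7, 8, 9, 10.732))

# Row positions where the ID has a non-zero fractional part.
idx <- which(df$ID %% 1 != 0)

# Collect position and value in one small table.
result <- data.frame(position = idx, value = df$ID[idx])
result
```

This uses an exact comparison with 0; if the IDs come out of calculations, one of the tolerance-based checks may be safer.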