For example, I have the following data:
ID
74019559952254665
74019229952254665
74019889952254665
74020209952254665
74020229952254665
I want to extract every ID whose year falls in the range 1922 to 2022. Please note that every ID starts with the fixed prefix 740, the four digits after it are the year of birth, and the remaining digits are random. For example, the ID 74018509952254665 should be ignored, because 1850 is not in my range.
CodePudding user response:
Assuming the ID column to be integers which would always have the same width of 17 digits, we can use integer division and the modulus here:
df[((df$ID %/% 10000000000) %% 10000) %in% 1922:2022, ]
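As a quick check of the arithmetic, here is a minimal sketch on the sample IDs from the question, plus the out-of-range example. It assumes the IDs are stored as numeric (double) values in a data frame named `df`; the variable names `year` and `result` are only illustrative. Values of this size are larger than 2^53, so the trailing digits may not be stored exactly, but the leading digits that hold the year should survive the division.

```r
# Sample IDs from the question, plus the out-of-range example (year 1850).
# Assumption: the IDs are stored as numeric (double) values.
df <- data.frame(ID = c(74019559952254665, 74019229952254665,
                        74019889952254665, 74020209952254665,
                        74020229952254665, 74018509952254665))

# Drop the last 10 digits, then keep the last 4 digits of what remains.
# For a 17-digit ID this leaves digits 4 to 7, i.e. the year.
year <- (df$ID %/% 1e10) %% 10000

# Keep only rows whose year falls in the requested range.
result <- df[year %in% 1922:2022, , drop = FALSE]
```

With these inputs the row carrying the year 1850 is dropped and the other five rows are kept. `drop = FALSE` keeps the result a data frame even though it has a single column.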
CodePudding user response:
You can extract by position with substr, and then filter the appropriate years.
df[as.numeric(substr(df$ID, 4, 7)) %in% 1922:2022, ]
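A sketch of the same filter with the IDs held as character strings, which is an assumption here and not something the question states. If the column is numeric, `substr` works on R's text conversion of the number, which may use scientific notation for values this large, so reading the IDs as text (for example with `colClasses = "character"` in `read.csv`) is the safer route. The extra `startsWith` check for the fixed `740` prefix is an optional addition.

```r
# Assumption: IDs are character strings, so every digit is preserved.
df <- data.frame(ID = c("74019559952254665", "74019229952254665",
                        "74019889952254665", "74020209952254665",
                        "74020229952254665", "74018509952254665"),
                 stringsAsFactors = FALSE)

# Characters 4 to 7 hold the year of birth.
year <- as.integer(substr(df$ID, 4, 7))

# Keep rows with the fixed "740" prefix and a year in the requested range.
result <- df[startsWith(df$ID, "740") & year %in% 1922:2022, , drop = FALSE]
```

Here too the ID with the year 1850 is filtered out and the remaining five are kept.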