Preface: I am a beginner to R that is eager to learn. Please don't mistake the simplicity of the question (if it is a simple answer) for lack of research or effort!
Here is a look at the data I am working with:
year state age POP
1: 90 1001 0 239
2: 90 1001 0 203
3: 90 1001 1 821
4: 90 1001 1 769
5: 90 1001 2 1089
The state column contains the FIPS codes for all states. For the purpose of merging, I need the state column to match my another dataset. To achieve this task, all I have to do is omit the last two numbers for each FIPS code such that the table looks like this:
year state age POP
1: 90 10 0 239
2: 90 10 0 203
3: 90 10 1 821
4: 90 10 1 769
5: 90 10 2 1089
I can't figure out how to accomplish this task on a numeric column. Substr() makes this easy on a character column.
CodePudding user response:
In case your number is not always 4 digits long, to omit the last two you can make use of the vectorized behavior of substr()
x <- rownames(mtcars)[1:5]
x
#> [1] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710"
#> [4] "Hornet 4 Drive" "Hornet Sportabout"
substr(x, 1, nchar(x)-2)
#> [1] "Mazda R" "Mazda RX4 W" "Datsun 7" "Hornet 4 Dri"
#> [5] "Hornet Sportabo"
# dummy code for inside a data.table
dt[, x_new := substr(x, 1, nchar(x)-2)]
CodePudding user response:
Just for generalizing this in the instance when you might have a very large numeric column, and need to substr
it correctly. (Which is probably a good argument for storing/importing it as a character column to start with, but it's an imperfect world...)
x <- c(10000000000, 1000000000, 100000000, 10000000, 1000000,100000,10000,1000,100)
substr(x, 1, nchar(x)-2 )
#[1] "1e " "1e " "1e " "1e " "1e " "1e " "100" "10" "1"
as.character(x)
#[1] "1e 10" "1e 09" "1e 08" "1e 07" "1e 06" "1e 05" "10000" "1000"
#[9] "100"
xsf <- sprintf("%.0f", x)
substr(xsf, 1, nchar(xsf)-2)
#[1] "100000000" "10000000" "1000000" "100000" "10000"
#[6] "1000" "100" "10" "1"
cbind(x, xsf, xsfsub=substr(xsf, 1, nchar(xsf)-2) )
# x xsf xsfsub
# [1,] "1e 10" "10000000000" "100000000"
# [2,] "1e 09" "1000000000" "10000000"
# [3,] "1e 08" "100000000" "1000000"
# [4,] "1e 07" "10000000" "100000"
# [5,] "1e 06" "1000000" "10000"
# [6,] "1e 05" "100000" "1000"
# [7,] "10000" "10000" "100"
# [8,] "1000" "1000" "10"
# [9,] "100" "100" "1"