Context: I'm trying to write a parser in R for the track files exported by my preferred GPS app. The files use a custom binary specification, with latitude, longitude, and timestamps all represented as 8-byte, big-endian, signed integers. For example, latitude is degrees north x10^7. This is the first time I've messed around with parsing raw/hex representations.
Let's say I have 3 raw integers:
# Should parse as 377441228
lat = as.raw(c(0x00, 0x00, 0x00, 0x00, 0x16, 0x7f, 0x4b, 0xcc))
# Should parse as -1195899101
lon = as.raw(c(0xff, 0xff, 0xff, 0xff, 0xb8, 0xb8, 0x07, 0x23))
# Should parse as 1618678057000
time = as.raw(c(0x00, 0x00, 0x01, 0x78, 0xe0, 0xbb, 0x08, 0x28))
The first approach I found was to use readBin(). This works correctly for lat and lon but not time:
# 377441228: correct
readBin(lat, integer(), size = 8,
        signed = TRUE, endian = 'big')
# -1195899101: correct
readBin(lon, integer(), size = 8,
        signed = TRUE, endian = 'big')
# -524613592: incorrect
readBin(time, integer(), size = 8,
        signed = TRUE, endian = 'big')
The next approach was to do some string wrangling and pass the result through as.numeric(). This worked for lat and time, but not lon:
library(magrittr)
parser = function(hex) {
  hex |>
    paste(collapse = '') %>%
    paste0('0x', .) |>
    as.numeric()
}
# 377441228: correct
parser(lat)
# 1.844674e+19: incorrect
parser(lon)
# 1.618678e+12: correct
parser(time)
How do I parse these?
Answer:
You can use this small function, which uses only base R. It converts the raw bytes into bits, arranges them into a single big-endian vector of 1s and 0s, then applies the two's complement rule to get the value. The sign bit is folded into every term instead of subtracting 2^63 at the end: R returns a double, and a sum close to 2^63 is rounded to a multiple of 1024, which would turn lon into -1195898880.
parser <- function(x) {
  # 8 x 8 matrix of bits: one column per byte, most significant bit first
  bits <- sapply(x, function(y) rev(as.integer(rawToBits(y))))
  # -b1 * 2^63 + sum(b * 2^k) rewritten as sum((b - b1) * 2^k) - b1
  sum((bits[-1] - bits[1]) * 2^(62:0)) - bits[1]
}
Testing, we have:
lat <- as.raw(c(0x00, 0x00, 0x00, 0x00, 0x16, 0x7f, 0x4b, 0xcc))
lon <- as.raw(c(0xff, 0xff, 0xff, 0xff, 0xb8, 0xb8, 0x07, 0x23))
time <- as.raw(c(0x00, 0x00, 0x01, 0x78, 0xe0, 0xbb, 0x08, 0x28))
parser(lat)
#> [1] 377441228
parser(lon)
#> [1] -1195899101
parser(time)
#> [1] 1.618678e+12
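A note on precision, since it affects any decoder that returns an R numeric: doubles hold every integer exactly only up to 2^53, and just below 2^63 adjacent doubles are 1024 apart. The sketch below (variable names are illustrative, not from the original post, and it assumes ordinary IEEE 754 double arithmetic) shows how subtracting 2^63 from a value close to 2^63 can change the low digits of a negative coordinate.

```r
# Integers are exact in a double only up to 2^53
exact_limit <- 2^53 == 2^53 + 1
exact_limit                      # TRUE: 2^53 + 1 rounds back to 2^53

# With the sign bit removed, the remaining 63 bits of lon add up to
# 2^63 - 1195899101. That intermediate value is rounded to a multiple
# of 1024 before the final subtraction takes place.
naive <- (2^63 - 1195899101) - 2^63
naive                            # -1195898880, off by 221
```

The values in this file format (millisecond timestamps, degrees times 10^7) are far below 2^53, so a double can hold the final result exactly; only an intermediate value near 2^63 is at risk.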
If you prefer a vectorized version that handles a list of values at once, you can do:
parser <- function(x) {
  sapply(x, function(z) {
    bits <- sapply(z, function(y) rev(as.integer(rawToBits(y))))
    # Fold the sign bit into each term so no intermediate value nears 2^63,
    # where doubles are only accurate to a multiple of 1024
    sum((bits[-1] - bits[1]) * 2^(62:0)) - bits[1]
  })
}
parser(list(lat, lon, time))
#> [1] 377441228 -1195899101 1618678057000
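If the track file is read into one long raw vector, another option is to let readBin() do the byte handling. The sketch below is an alternative under stated assumptions, not the method shown above: read_int64_be is a hypothetical helper name, and the input is assumed to be a raw vector whose length is a multiple of 8. It reads unsigned 16-bit words (the signed argument of readBin() applies only to sizes 1 and 2), reinterprets the top word of each value as signed, and combines the words as doubles, so results are exact only for magnitudes up to 2^53.

```r
# Hypothetical helper: decode consecutive 8-byte, big-endian, signed
# integers from a raw vector into a numeric (double) vector.
read_int64_be <- function(buf) {
  stopifnot(is.raw(buf), length(buf) %% 8 == 0)
  # Unsigned 16-bit words, four per value, most significant word first
  w <- readBin(buf, integer(), n = length(buf) / 2, size = 2,
               signed = FALSE, endian = "big")
  w <- matrix(w, ncol = 4, byrow = TRUE)
  # Reinterpret each value's top word as signed (two's complement)
  neg <- w[, 1] >= 2^15
  w[neg, 1] <- w[neg, 1] - 2^16
  # Weight the words and add them up, one row per decoded value
  drop(w %*% 2^c(48, 32, 16, 0))
}

# The three example values from the question, concatenated
buf <- as.raw(c(
  0x00, 0x00, 0x00, 0x00, 0x16, 0x7f, 0x4b, 0xcc,
  0xff, 0xff, 0xff, 0xff, 0xb8, 0xb8, 0x07, 0x23,
  0x00, 0x00, 0x01, 0x78, 0xe0, 0xbb, 0x08, 0x28
))
decoded <- read_int64_be(buf)
decoded  # 377441228, -1195899101, 1618678057000
```

Because the negative top word is applied before the words are combined, no intermediate value comes close to 2^63 for coordinates or timestamps of this size.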
Created on 2023-01-01 with reprex v2.0.2