I would like some help interpreting the result below. Specifically, I want to understand what the values in min_distance
mean. Are they the minimum distances between properties?
database<-structure(list(Latitude = c(-24.781624, -24.775017, -24.769196,
-24.761741, -24.752019, -24.748008, -24.737312, -24.744718, -24.751996,
-24.724589), Longitude = c(-49.937369,
-49.950576, -49.927608, -49.92762, -49.920608, -49.927707, -49.922095,
-49.915438, -49.910843, -49.899478)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
library(geosphere)
d<-distm(database[,2:1])
diag(d)<-1000000
min_distance<-as.matrix(apply(d,MARGIN=2,FUN=min))
> min_distance
[,1]
[1,] 1522.9967
[2,] 1522.9967
[3,] 825.7868
[4,] 825.7868
[5,] 844.4219
[6,] 844.4219
[7,] 1061.3607
[8,] 930.5737
[9,] 930.5737
[10,] 2687.3265
CodePudding user response:
If you print your output you will have a better sense of what you are looking at.
database<-structure(list(Latitude = c(-24.781624, -24.775017, -24.769196,
-24.761741, -24.752019, -24.748008, -24.737312, -24.744718, -24.751996,
-24.724589), Longitude = c(-49.937369,
-49.950576, -49.927608, -49.92762, -49.920608, -49.927707, -49.922095,
-49.915438, -49.910843, -49.899478)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
database
# A tibble: 10 x 2
# Latitude Longitude
# <dbl> <dbl>
# 1 -24.8 -49.9
# 2 -24.8 -50.0
# 3 -24.8 -49.9
# 4 -24.8 -49.9
# 5 -24.8 -49.9
# 6 -24.7 -49.9
# 7 -24.7 -49.9
# 8 -24.7 -49.9
# 9 -24.8 -49.9
# 10 -24.7 -49.9
This is a table of 10 locations, with their respective latitudes and longitudes.
d<-geosphere::distm(database[,2:1])
diag(d)<-1000000
d
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1000000.000 1522.997 1693.9978 2413.0564 3691.5733 3849.7209 5145.795 4651.0675 4238.9045 7389.402
# [2,] 1522.997 1000000.000 2410.7105 2748.2810 3959.3951 3781.6594 5073.723 4888.2849 4759.4651 7610.377
# [3,] 1693.998 2410.711 1000000.0000 825.7868 2030.1461 2347.0014 3575.519 2977.7558 2550.5461 5701.861
# [4,] 2413.056 2748.281 825.7868 1000000.0000 1289.4746 1521.2196 2763.090 2252.5415 2011.1860 5003.997
# [5,] 3691.573 3959.395 2030.1461 1289.4746 1000000.0000 844.4219 1636.011 963.0860 987.7503 3714.976
# [6,] 3849.721 3781.659 2347.0014 1521.2196 844.4219 1000000.0000 1313.776 1293.4863 1762.1203 3858.080
# [7,] 5145.795 5073.723 3575.5193 2763.0905 1636.0110 1313.7763 1000000.000 1061.3607 1985.2383 2687.327
# [8,] 4651.067 4888.285 2977.7558 2252.5415 963.0860 1293.4863 1061.361 1000000.0000 930.5737 2752.885
# [9,] 4238.905 4759.465 2550.5461 2011.1860 987.7503 1762.1203 1985.238 930.5737 1000000.0000 3246.261
# [10,] 7389.402 7610.377 5701.8609 5003.9971 3714.9761 3858.0802 2687.327 2752.8847 3246.2607 1000000.000
This is the matrix of distances, in meters, from each location to every other location. You have assigned a large number to the diagonal, presumably because each location's distance to itself is 0 and would otherwise always be the column minimum.
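To see why the diagonal has to be masked, here is a minimal sketch using only the distances between locations 1 to 4, typed in by hand from the matrix printed above so that it runs without geosphere. Using Inf instead of 1000000 is my own suggestion rather than something from the question: it cannot be mistaken for a real distance, however far apart the points are.

```r
# Distances (meters) between locations 1-4, copied from the printed matrix.
# The diagonal is 0 here, as it is before any masking is applied.
d <- matrix(c(
     0.000,  1522.997,  1693.998,  2413.056,
  1522.997,     0.000,  2410.711,  2748.281,
  1693.998,  2410.711,     0.000,   825.7868,
  2413.056,  2748.281,   825.7868,    0.000
), nrow = 4, byrow = TRUE)

# Without masking, every column minimum is the zero self-distance.
unmasked <- apply(d, MARGIN = 2, FUN = min)

# Inf can never be the minimum of a column that contains a finite value.
diag(d) <- Inf
masked <- apply(d, MARGIN = 2, FUN = min)

print(unmasked)
print(masked)
```

The masked result matches the first four values of min_distance, while the unmasked one is all zeros.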
min_distance<-as.matrix(apply(d,MARGIN=2,FUN=min))
min_distance
# [,1]
# [1,] 1522.9967
# [2,] 1522.9967
# [3,] 825.7868
# [4,] 825.7868
# [5,] 844.4219
# [6,] 844.4219
# [7,] 1061.3607
# [8,] 930.5737
# [9,] 930.5737
# [10,] 2687.3265
This is the key line. When you use apply with MARGIN of 2 on a matrix it iterates over columns, although in this case it doesn't matter as the matrix is symmetrical. If you change the MARGIN to 1 the output will be the same.
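A quick way to convince yourself of the row/column equivalence is to compute both and compare. This is only a sketch on a small piece of the data: the 3 x 3 block for locations 7 to 9, typed in by hand from the printed matrix, with the same 1000000 placeholder on the diagonal.

```r
# Distances (meters) between locations 7, 8 and 9, copied from the printed
# matrix, with the 1000000 placeholder on the diagonal.
d <- matrix(c(
  1000000.0000,    1061.3607,    1985.2383,
     1061.3607, 1000000.0000,     930.5737,
     1985.2383,     930.5737, 1000000.0000
), nrow = 3, byrow = TRUE)

by_col <- apply(d, MARGIN = 2, FUN = min)  # minimum of each column
by_row <- apply(d, MARGIN = 1, FUN = min)  # minimum of each row

# A distance matrix equals its own transpose, so both results agree.
print(identical(d, t(d)))
print(identical(by_col, by_row))
```

For a matrix that is not symmetrical the two calls can give different answers, so it is still worth picking the MARGIN that matches what you mean.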
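You are asking R for the lowest value in every column, that is, the distance from each location to its nearest neighbor. In column 1 this is the distance between locations 1 and 2, and column 2 holds the same value. The next two values are the distance between locations 3 and 4, and so on. Note that the pairs are not always mutual: location 7 is closest to location 8, but location 8 is closest to location 9. Whether this is useful depends on the question you are asking.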
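If you also want to know which location each minimum refers to, which.min gives the position of the smallest value rather than the value itself. The sketch below is one possible way to do it, again on the hand-typed 3 x 3 block for locations 7 to 9. The names ids, nearest_idx and result are my own, and Inf is used on the diagonal instead of 1000000.

```r
# Distances (meters) between locations 7, 8 and 9, copied from the printed
# matrix. Row and column names record the original location numbers.
ids <- c("7", "8", "9")
d <- matrix(c(
        Inf, 1061.3607, 1985.2383,
  1061.3607,       Inf,  930.5737,
  1985.2383,  930.5737,       Inf
), nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# For each column, the row position of the smallest distance.
nearest_idx <- apply(d, MARGIN = 2, FUN = which.min)

result <- data.frame(
  location = colnames(d),
  nearest  = rownames(d)[nearest_idx],
  distance = apply(d, MARGIN = 2, FUN = min),
  row.names = NULL
)
print(result)
```

On the full matrix the same two apply calls work unchanged, with the row index standing in for the location number.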