I have a data frame (jan2020
) of bicycle services, where the origin and destination stations are mentioned as follows:
Origen_Id Destiny_Id #data frame A
2 52
93 10
In another data frame (stations
), I have the station's Id, name, latitude and longitude, and other information.
id Nombre Latitude Longitude
2 Name1 20.666378 -103.34882
3 Name2 20.667228 -103.366
4 Name3 20.66769 -103.368252
I would like to create a data frame like this,
Origen_Id Destiny_Id Lat_Orig Lon_Orig Lat_Des Long_Des
2 52 20.666378 -103.34882 some number some number
I tried to do it with a for-loop
but it's not efficient, another thought was to
jan2020$Index_Id <- match(jan2020$Origen_Id, stations$id)
However, this only tells me the index of the station in the data frame stations
, and I'm not sure how to create a new column with the information I need.
CodePudding user response:
Let two data as df1
and df2
.
df1 <- read.table(text = "Origen_Id Destiny_Id
2 4
3 2 ", header = T)
Origen_Id Destiny_Id
1 2 4
2 3 2
df2 <- read.table(text = "id Nombre Latitude Longitude
2 Name1 20.666378 -103.34882
3 Name2 20.667228 -103.366
4 Name3 20.66769 -103.368252", header = T)
id Nombre Latitude Longitude
1 2 Name1 20.66638 -103.3488
2 3 Name2 20.66723 -103.3660
3 4 Name3 20.66769 -103.3683
Then , desired result df3
may done by
df3 <- cbind(df1, df2[match(df1$Origen_Id, df2$id),3:4], df2[match(df1$Destiny_Id, df2$id),3:4])
names(df3) <- c("Origen_Id", "Destiny_Id", "Lat_Orig", "Lon_Orig", "Lat_Des", "Long_Des")
df3
Origen_Id Destiny_Id Lat_Orig Lon_Orig Lat_Des Long_Des
1 2 4 20.66638 -103.3488 20.66769 -103.3683
2 3 2 20.66723 -103.3660 20.66638 -103.3488
CodePudding user response:
Merge it twice, once on Origen_Id then on Destiny_Id with suffixes:
merge(merge(df1, df2, by.x = "Origen_Id", by.y = "id"),
df2, by.x = "Destiny_Id", by.y = "id", suffixes = c("Ori", "Des"))
# Destiny_Id Origen_Id NombreOri LatitudeOri LongitudeOri NombreDes LatitudeDes LongitudeDes
# 1 2 3 Name2 20.66723 -103.3660 Name1 20.66638 -103.3488
# 2 4 2 Name1 20.66638 -103.3488 Name3 20.66769 -103.368
(using Park's data)