Home > Software design >  Merging 2 dataframes (when columns are different)
Merging 2 dataframes (when columns are different)

Time:08-08

I am trying to merge 2 data frames, 1 of which contains latitude and longitude. Im not sure if/how this would be possible but basically what I wish to do if add the latitude, longitude and evevation to each cities in df1 but was unsuccessful in doing so as the dfs dont match up.

df1

  MaxTemp[,1:6]
  # A tibble: 366 × 6
         Date Machrihanish High_Wycombe Camborne Dun_Fell Plymouth
        <dbl>        <dbl>        <dbl>    <dbl>    <dbl>    <dbl>
   1 20200101          8.5          6.9      9.6      3.3      9.9
   2 20200102         11.7          9.1     11.2      5       10.9
   3 20200103          9.1          9.9     11.2      5.1     11.1
   4 20200104          9.2          8.1      9.4      2.2      9.4
   5 20200105         11.7          7.6      9        4.3      9.3
   6 20200106         10.8          8       11.6      3.7     10.6
   7 20200107         14.7         11.7     12        6.7     11.5
   8 20200108         11.2         11.8     11.6      6.2     11.3
   9 20200109          7           12       11.6     -0.2     11.5
  10 20200110          9.3          7.4     10        0       10.1

df 2

metadata
# A tibble: 20 × 4
   Location     Longitude Latitude Elevation
   <chr>            <dbl>    <dbl>     <dbl>
 1 Machrihanish    -5.70      55.4        10
 2 High_Wycombe    -0.807     51.7       204
 3 Camborne        -5.33      50.2        87
 4 Dun_Fell        -2.45      54.7       847
 5 Plymouth        -4.12      50.4        50

CodePudding user response:

Here is a solution that tidies the data to long format by location and day, and merges the lat / long information.

Using data provided in the original post, we read it into two data frames.

tempText <- "rowId Date Machrihanish High_Wycombe Camborne Dun_Fell Plymouth
     1 20200101          8.5          6.9      9.6      3.3      9.9
2 20200102         11.7          9.1     11.2      5       10.9
3 20200103          9.1          9.9     11.2      5.1     11.1
4 20200104          9.2          8.1      9.4      2.2      9.4
5 20200105         11.7          7.6      9        4.3      9.3
6 20200106         10.8          8       11.6      3.7     10.6
7 20200107         14.7         11.7     12        6.7     11.5
8 20200108         11.2         11.8     11.6      6.2     11.3
9 20200109          7           12       11.6     -0.2     11.5
10 20200110          9.3          7.4     10        0       10.1"

library(tidyr)
library(dplyr)

temps <- read.table(text = tempText,header = TRUE)

latLongs <-"rowId   Location     Longitude Latitude Elevation
 1 Machrihanish    -5.70      55.4        10
 2 High_Wycombe    -0.807     51.7       204
 3 Camborne        -5.33      50.2        87
 4 Dun_Fell        -2.45      54.7       847
 5 Plymouth        -4.12      50.4        50"

latLongs <- read.table(text = latLongs,header = TRUE)

Next, we use tidyr::pivot_longer() to generate long format data, and then merge it with the lat long data via dplyr::full_join().

temps %>%
     select(-rowId) %>%
     pivot_longer(.,Machrihanish:Plymouth,names_to = "Location",  values_to="MaxTemp") %>%
     full_join(.,latLongs) %>% select(-rowId) -> joinedData

head(joinedData)

...and the first few rows of joined output looks like this:

> head(joinedData)
# A tibble: 6 × 6
      Date Location     MaxTemp Longitude Latitude Elevation
     <int> <chr>          <dbl>     <dbl>    <dbl>     <int>
1 20200101 Machrihanish     8.5    -5.7       55.4        10
2 20200101 High_Wycombe     6.9    -0.807     51.7       204
3 20200101 Camborne         9.6    -5.33      50.2        87
4 20200101 Dun_Fell         3.3    -2.45      54.7       847
5 20200101 Plymouth         9.9    -4.12      50.4        50
6 20200102 Machrihanish    11.7    -5.7       55.4        10
>
  •  Tags:  
  • r
  • Related