Home > Net >  Merging of two datasets of different lenghts
Merging of two datasets of different lenghts

Time:09-28

I'm trying to merge two datasets I have.

df1:

day month year lon lat month-year
3 5 2009 5.7 53.9 May 2009
8 9 2004 6.9 52.6 Sep 2004
15 9 2004 3.8 50.4 Sep 2004
5 5 2009 2.7 51.2 May 2009
28 7 2005 14.8 62.4 Jul 2005
18 9 2004 5.1 52.5 Sep 2004

df2:

nao-value sign month-year
- 2.1 Negative Sep 2004
1.3 Positive Jul 2005
- 1.1 Negative May 2009

I want to merge this to add the NAO value for each month and year in the occurrence data, meaning i want the NAO value for each specific month repeated for all registrations of that month in the occurrece data.

Problem is I cannot get the NAO values to line up where it should by the occurrence data, its either placed just repetitive and not alligned with the date it should, given as month-year.x and month-year.y ,or it is given back as NA value.

I have tried a few different approaches:

df3 <- merge(df1, df2, by="month-year")

df3 <- merge(cbind(df1, X=rownames(df1)), cbind(df2, variable=rownames(df2)))

df3 <- merge(df1,df2, by ="month-year", all.x = TRUE,all.y=TRUE, sort = FALSE)

df3 <- merge(df1, df2, by=intersect(df1$month-year(df1), df2$month-year(df2)))

But not of those give the result I desire.

Help is highly appreciated

Edit to include dput:

dput(head(df1, 10)) :

structure(list(Day = c(29, 2, 14, 31, 16, 7, 25, 12, 21, 22), 
Month = c(7, 7, 7, 8, 8, 7, 8, 6, 6, 9), Year = c(2010, 2015, 
2010, 2018, 2016, 2018, 2019, 2004, 2015, 2019), Lon = c(-6.155014, 
-5.820868, -5.509842, -5.495277, -5.469389, -5.469389, -5.469389, 
-5.466995, -5.461942, -5.457127), Lat = c(59.09478, 59.125228, 
57.959196, 57.96022, 57.986825, 57.986825, 57.986825, 57.874527, 
57.95972, 58.07697), Date = c("Jul 2010", "Jul 2015", "Jul 2010", 
"Aug 2018", "Aug 2016", "Jul 2018", "Aug 2019", "Jun 2004", 
"Jun 2015", "Sep 2019")), row.names = c(NA, -10L), class = 
c("tbl_df", 
"tbl", "data.frame"))

dput(head(df2, 10)) :

structure(list(NAO = c(1.04, 1.41, 1.46, 2, -1.53, -0.02, 0.53, 
0.97, 1.06, 0.23), Sign = c("Positive", "Positive", "Positive", 
"Positive", "Negative", "Negative", "Positive", "Positive", 
"Positive", 
"Positive"), Date = c("jan 1990", "feb 1990", "mar 1990", "apr 1990", 
"mai 1990", "jun 1990", "jul 1990", "aug 1990", "sep 1990", "okt 
1990"
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))

CodePudding user response:

merge function is case sensitive. You have different cases in two dataframes that you are merging. Make the case in both the dataframe same and then perform the merge. Try -

result <- merge(transform(df1, Date = tolower(Date)), df2, by = 'Date')

CodePudding user response:

Using tidyverse

library(dplyr)
df1 %>% 
    mutate(Date = tolower(Date)) %>%
    inner_join(df2, by = 'Date')
  • Related