Home > Net >  Melt/ reshape dataframe to combine columns and fill rows with NAs
Melt/ reshape dataframe to combine columns and fill rows with NAs

Time:10-15

Apologies that there is a wealth of information on this site about melting and reshaping data, however, I cannot find the answer to my question on any of the pages I've visited. I have a data set which looks something like:

A Year | A Mean Temp | A Max Temp | A Min Temp | B Year | B Mean Temp | B Max Temp | B Min Temp |

and I want to end up with

Year | A Mean Temp | A Max Temp | A Min Temp |B Mean Temp | B Max Temp | B Min Temp

and fill columns which don't have data for that specific year with 'NA'.

The desired output would be something like:

[Table][1]

I believe the answer lies somewhere in something like:

library(dplyr)
library(tidyr)
library(stringr)
Data %>%
  pivot_longer(cols = contains("Year"), names_to = c("Country", ".value"), 
               names_sep="_", values_drop_na = TRUE) %>%
  rename_with(~ str_c('Country_', .), Rating:Year)```

But as of yet no luck.

Any help would be appreciated.

Thank you

Data


structure(list(Antarctica.Year.CE = 167:172, Antarctica.Temp..C. = c(0.33, 
0.31, 0.18, 0.08, -0.01, -0.11), Antarctica.Min..C. = c(-1.24, 
-1.26, -1.39, -1.48, -1.57, -1.67), Antarctica.Max..C. = c(1.89, 
1.87, 1.74, 1.64, 1.55, 1.45), Arctic.Year.CE = 1:6, Arctic.Temp..C. = c(-1.15, 
-0.96, -0.32, 0.1, -0.18, -0.61), Arctic.Min..C. = c(-1.92, -1.76, 
-1.38, -0.74, -1.08, -1.17), Arctic.Max..C. = c(-0.31, -0.11, 
0.48, 0.83, 0.73, 0.16), Asia.Year.CE = 800:805, Asia.Temp..C. = c(-0.31, 
-0.14, -0.36, -0.67, -0.78, -0.26), Asia.Min..C. = c(-1.4, -1.23, 
-1.45, -1.76, -1.87, -1.35), Asia.Max..C. = c(0.79, 0.96, 0.74, 
0.43, 0.31, 0.83), Australasia.Year.CE = 1001:1006, Australasia.Temp..C. = c(-0.24, 
-0.38, -0.29, -0.33, -0.34, -0.11), Australasia.Min..C. = c(-0.62, 
-0.79, -0.71, -0.73, -0.73, -0.56), Australasia.Max..C. = c(0.15, 
0.03, 0.13, 0.07, 0.05, 0.34), Europe.Year.CE = 1:6, Europe.Temp..C. = c(0.09, 
-0.26, -0.24, 0.22, 0.32, 0.67), Europe.Min..C. = c(-0.69, -1.14, 
-1.18, -0.66, -0.48, -0.11), Europe.Max..C. = c(0.88, 0.56, 0.61, 
1.07, 1.14, 1.5), North.America...Pollen.Year.CE = c(480L, 510L, 
540L, 570L, 600L, 630L), North.America...Pollen.Temp..C. = c(-0.25, 
-0.29, -0.33, -0.34, -0.34, -0.34), North.America...Pollen.Min..C. = c(-0.74, 
-0.7, -0.66, -0.65, -0.64, -0.64), North.America...Pollen.Max..C. = c(0.24, 
0.11, 0, -0.04, -0.04, -0.04), North.America...Trees.Year.CE = c(1204L, 
1214L, 1224L, 1234L, 1244L, 1254L), North.America...Trees.Temp..C. = c(-0.22, 
-0.45, -0.38, -0.87, -0.81, -0.06), North.America...Trees.Min..C. = c(-0.53, 
-0.72, -0.67, -1.12, -1.09, -0.35), North.America...Trees.Max..C. = c(0.04, 
-0.2, -0.11, -0.57, -0.52, 0.18), South.America.Year.CE = 857:862, 
    South.America.Temp..C. = c(-0.3, -0.21, -0.07, -0.38, -0.41, 
    -0.19), South.America.Min..C. = c(-1.12, -1, -0.88, -1.19, 
    -1.22, -0.98), South.America.Max..C. = c(0.53, 0.58, 0.74, 
    0.43, 0.39, 0.61)), row.names = c(NA, 6L), class = "data.frame") ```



  


  [1]: https://i.stack.imgur.com/0sV7a.png

CodePudding user response:

For something as small as this, I'd often just go with a more manual approach.

Given your df above, I specify the lists of countries in the columns and then grepl() on the df columns to select those columns. Then, we rename the columns, return the new dataframe. We can then apply the function to the list of countries and then rbind with do.call.

country_list = c('Antarctica', 'Arctic', 'Asia', 'Australasia', 'Europe', 'North.America...Pollen', 'North.America...Trees', 'South.America')

get_cols = function(country) {
  df_new = df[,grepl(country, colnames(df))]
  df_new$Country = rep(country, nrow(df_new))
  colnames(df_new) = c('Year', 'Temp', 'Min_Temp', 'Max_Temp', 'Country')
  
  return(df_new)
}

df_final = do.call(rbind, lapply(country_list, get_cols))

Hope that returns what you're looking for?

  • Related