Home > other >  Put rows in columns and their corresponding values in rows in r
Put rows in columns and their corresponding values in rows in r

Time:06-27

My original dataframe is in this format:

enter image description here

I want to take date in columns and variable Sun and Soil_moisture in rows and name it metric. Their values should be placed horizontally as shown below.

enter image description here

Is it doable in r? Since r doesn't allow numbers to be in columns header I have prefixed with d

My sample dataset:

data <- data.frame(country = c("US","US","US","IND","IND","IND"),
                   state = c("TX","TX","TX","AP","AP","AP"),
                   date = c("20220601","20220602","20220603","20220601","20220602","20220603"),
                   Sun = c(30,35,40,20,30,40),
                   Soil_moisture = c(2.5,3.4,4.5,5.6,6.7,7.8))

I tried the dcast function but variable Sun is not coming into rows.

library(data.table)
dcast(data,country state~date,value.var = "Sun")

Please suggest

CodePudding user response:

You can use a combination of melt and dcast:

library(data.table)

dcast(
  melt(
    setDT(data),
    id.vars = c("country", "state", "date"),
    variable.name = "metric"
  ),
  country   state   metric ~ paste0("d_", date)
)

Output

   country  state        metric d_20220601 d_20220602 d_20220603
    <char> <char>        <fctr>      <num>      <num>      <num>
1:     IND     AP           Sun       20.0       30.0       40.0
2:     IND     AP Soil_moisture        5.6        6.7        7.8
3:      US     TX           Sun       30.0       35.0       40.0
4:      US     TX Soil_moisture        2.5        3.4        4.5

If you left out paste0, then it would return just the date as the column name. But as @jdobres, this is generally not advisable.

CodePudding user response:

Fairly straightforward to do with tidyverse and its pivot_wider and pivot_longer functions:

library(tidyverse)

new_data <- data %>% 
  pivot_longer(c(Sun, Soil_moisture), names_to = 'metric', values_to = 'value') %>% 
  pivot_wider(names_prefix = 'd_', names_from = 'date', values_from = 'value')

  country state metric        d_20220601 d_20220602 d_20220603
  <chr>   <chr> <chr>              <dbl>      <dbl>      <dbl>
1 US      TX    Sun                 30         35         40  
2 US      TX    Soil_moisture        2.5        3.4        4.5
3 IND     AP    Sun                 20         30         40  
4 IND     AP    Soil_moisture        5.6        6.7        7.8

You could actually force pivot_wider to use the dates as column names by omitting the names_prefix argument, but it is generally a bad idea to have numeric column names.

  • Related