Home > Enterprise >  How can I modify my dataset (I have several columns with the same variable separated by month (wide
How can I modify my dataset (I have several columns with the same variable separated by month (wide

Time:06-09

I'm new to R and I need your help with the next problem.

I have the following dataset

id Country City Accrued_Jan Accrued_Feb Accrued_Mar Paid_Jan Paid_Feb Paid_Mar
01 USA NY 110 110 130 100 100 110
02 ITALY ROME 80 90 100 70 70 90
03 FRANCE PARIS 70 80 90 70 70 90

And the result that I want is the next:

id Country City Month Accrued Paid
01 USA NY Jan 100 100
01 USA NY Feb 110 100
01 USA NY Mar 130 110
02 ITALY ROME Jan 80 70
02 ITALY ROME Feb 90 70
02 ITALY ROME Mar 100 90
03 FRANCE PARIS Jan 70 70
03 FRANCE PARIS Feb 80 70
03 FRANCE PARIS Mar 90 90

Any idea on how to do this? maybe with pivot.longer?

I would like to add a column that identifies the month and keep the name and values of the variables "Accrued" and "Paid" in separate columns

CodePudding user response:

you could follow a tidyverse approach and use a bunch of tidyverse expressions. One way that helps me visualise it is how I would do something like this in excel and take it forward from there. Here is a link to learn about tidyverse.

First you will have to activate the library:

library(tidyverse)

And then the code would look something like this:

df <- df |> pivot_longer(-c(id,Country,City),
                         names_to = "type", 
                         values_to = "amount")

df <- df |> separate(col = c(type),sep = "_",into = c("status","Month"))


df <- df |> pivot_wider(id_cols = -c(status,amount),
                        names_from = status,
                        values_from = amount)

Hope this helps. Happy learning!

CodePudding user response:

Here is a tidyverse solution

library(tidyverse)

df <- dplyr::tribble(
  ~id, ~Country, ~City, ~Accrued_Jan, ~Accrued_Feb, ~Accrued_Mar, ~Paid_Jan, ~Paid_Feb, ~Paid_Mar,
  1, "USA", "NY", 100, 110, 130, 100, 100, 110,
  2, "Italy", "Rome", 80, 90, 100, 70, 70, 90,
  3, "France", "Paris", 70, 80, 90, 70, 70, 90) %>% 
  tidyr::pivot_longer(
    cols = -c(id, Country, City),
    names_to = c("name", "month"),
    names_sep = "_",
    values_to = "value") %>% 
  tidyr::pivot_wider(names_from = name, values_from = value)
  •  Tags:  
  • r
  • Related