Home > Net >  Creating a unique ID column in R
Creating a unique ID column in R

Time:11-12

I hope I can explain this clearly. I have a 3 month daily timeseries data set that has multiple attribute values for each ID. Values under each attribute are stored in different columns.

Sample data

ID     Date     Var1    Var2
279557 1/1/2020 1       2
279557 1/2/2020 3       4
280485 1/1/2020 5       6
288785 1/2/2020 7       8

The purpose is to aggregate the ID value under each Var for each month, and then plot timeseries line plots for each aggregated ID for under each Var.

I first thought of doing a pivot long

  library(dplyr)
    library(tidyverse)
    

ID = c("297557", "279557", "280485", "280485")
Date = c("1/1/2020", "1/2/2020", "1/1/2020", "1/2/2020")
Var1 = c("1", "3", "5", "7")
Var2 = c("2", "4", "6", "8")

df = data.frame(ID, Date, Var1, Var2)

        df= df%>% 
          pivot_longer(-c(ID, Date))

Output

ID     Date     Value  
279557 1/1/2020 1 
279557 1/1/2020 2
279557 1/2/2020 3
279557 1/1/2020 4 
280485 1/1/2020 5       
280485 1/1/2020 6
280485 1/2/2020 7
280485 1/2/2020 8

Now in order to group (for ggplot2) each Var value under each ID, I am thinking of creating another unique ID column called Pixel. The unique ID should be such that, for example, for the first two values of Var1 for 297557, I can have a unique ID like ID1.Var1under the pixel column.

Pixel     ID     Date     Value  
ID1.Var1  279557 1/1/2020 1 
ID1.Var2  279557 1/1/2020 2
ID1.Var1  279557 1/1/2020 3
ID1.Var2  279557 1/2/2020 4 
ID2.Var1  280485 1/1/2020 5       
ID2.Var2  280485 1/1/2020 6
ID2.Var1  280485 1/2/2020 7
ID2.Var2  280485 1/2/2020 8

Now I can use ggplot

ggplot(data = df, aes(x= Date, y=Value, group = Pixel))  
  geom_line()   
  xlab("") 

How can I create a unique Pixel column automatically without having to manually type each unique ID under the Pixel column in R using dplyr?

CodePudding user response:

You may try

df %>%
  pivot_longer(-c(ID, Date)) %>%
  rowwise %>%
  mutate(Pixel = paste0(c("ID",as.integer(as.factor(ID)),".",name), collapse = "")) %>% select(-name)


  ID     Date     value Pixel   
  <chr>  <chr>    <chr> <chr>   
1 279557 1/1/2020 1     ID1.Var1
2 279557 1/1/2020 2     ID1.Var2
3 279557 1/2/2020 3     ID1.Var1
4 279557 1/2/2020 4     ID1.Var2
5 280485 1/1/2020 5     ID1.Var1
6 280485 1/1/2020 6     ID1.Var2
7 280485 1/2/2020 7     ID1.Var1
8 280485 1/2/2020 8     ID1.Var2

Update

df %>%
  pivot_longer(-c(ID, Date)) %>%
  rowwise %>%
  mutate(Pixel = paste0(c(ID,".",name), collapse = "")) %>% select(-name)

  ID     Date     value Pixel      
  <chr>  <chr>    <chr> <chr>      
1 279557 1/1/2020 1     279557.Var1
2 279557 1/1/2020 2     279557.Var2
3 279557 1/2/2020 3     279557.Var1
4 279557 1/2/2020 4     279557.Var2
5 280485 1/1/2020 5     280485.Var1
6 280485 1/1/2020 6     280485.Var2
7 280485 1/2/2020 7     280485.Var1
8 280485 1/2/2020 8     280485.Var2
  • Related