Generating a factor column

Time:12-01

I'm simulating a dataset that includes order histories. I want to add a weather column that takes on one of the following values: Clear, Rain, Snow.

This is what the dataset looks like:

> dput(data[1:10,-7])
structure(list(order_num = c(501073L, 969942L, 1091101L, 590143L, 
390404L, 219429L, 1025827L, 689629L, 694348L, 435848L), date = structure(c(1542344400, 
1552194000, 1550379600, 1534568400, 1523336400, 1563426000, 1595826000, 
1552712400, 1534309200, 1547960400), class = c("POSIXct", "POSIXt"
), tzone = ""), total_sale = c(36.3853391310075, 35.9405038506853, 
55.6254974332793, 47.7214780063544, 61.4086594373677, 32.8631076291332, 
33.3640439679803, 40.8944394660076, 54.9455495252506, 48.12597580998
), season = c("Spring", "Winter", "Winter", "Fall", "Fall", "Spring", 
"Summer", "Summer", "Fall", "Fall"), temp = c(53, 29, 38, 47, 
57, 60, 80, 88, 51, 51), weather = c(NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_)), row.names = c(NA, 
10L), class = "data.frame")

I plan on doing something like this:

skies1 <- c("Clear", "Rain", "Snow")
skies2 <- c("Clear", "Rain")

data <- data %>%
   mutate(weather = ifelse(season == "Winter", *generate random string from skies1*, *generate random string from skies2*))

but I am not sure which method will fill in the whole weather column.

Answer:

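The simplest solution is to use sample() together with rowwise(), so that a new value is drawn for every row instead of one draw being recycled down the whole column: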

library(dplyr)

data %>%
    rowwise() %>%  # evaluate the next step one row at a time
    mutate(weather = ifelse(season == "Winter", sample(skies1, 1), sample(skies2, 1)))
# A tibble: 10 × 6
# Rowwise: 
   order_num date                total_sale season  temp weather
       <int> <dttm>                   <dbl> <chr>  <dbl> <chr>  
 1    501073 2018-11-16 05:00:00       36.4 Spring    53 Rain   
 2    969942 2019-03-10 05:00:00       35.9 Winter    29 Rain   
 3   1091101 2019-02-17 05:00:00       55.6 Winter    38 Snow   
 4    590143 2018-08-18 06:00:00       47.7 Fall      47 Rain   
 5    390404 2018-04-10 06:00:00       61.4 Fall      57 Clear  
 6    219429 2019-07-18 06:00:00       32.9 Spring    60 Rain   
 7   1025827 2020-07-27 06:00:00       33.4 Summer    80 Rain   
 8    689629 2019-03-16 05:00:00       40.9 Summer    88 Clear  
 9    694348 2018-08-15 06:00:00       54.9 Fall      51 Clear  
10    435848 2019-01-20 05:00:00       48.1 Fall      51 Rain   
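
If rowwise() turns out to be slow on a larger simulated dataset, a vectorised variant is possible. The following is only a sketch under some assumptions: `toy` is a hypothetical stand-in for the real data frame, the seed is there purely for reproducibility, and the final factor() step is my reading of the "factor column" in the title. It draws a full-length vector from each pool with sample(..., replace = TRUE) and lets if_else() pick per row.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

skies1 <- c("Clear", "Rain", "Snow")  # winter pool, from the question
skies2 <- c("Clear", "Rain")          # pool for all other seasons

# Hypothetical stand-in for the question's data frame (season column only)
toy <- data.frame(
  season = c("Spring", "Winter", "Winter", "Fall", "Fall",
             "Spring", "Summer", "Summer", "Fall", "Fall"),
  stringsAsFactors = FALSE
)

toy <- toy %>%
  mutate(
    # Draw n() values from each pool, then choose element-wise by season
    weather = if_else(
      season == "Winter",
      sample(skies1, n(), replace = TRUE),
      sample(skies2, n(), replace = TRUE)
    ),
    # Convert to a factor with all three levels, as the title asks for
    weather = factor(weather, levels = skies1)
  )

print(toy)
```

This draws more random values than are strictly used, but it avoids row-by-row evaluation and returns an ordinary (ungrouped) data frame. With the rowwise() version, adding ungroup() afterwards has the same effect on grouping.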