How to relevel the levels of a factor variable without transforming it to integer in R?

Time:11-03

I want to collapse the four categories below into two new ones: zone_a contains north_east and nothern_central, and zone_b contains the other two. Is there a way to do that without the hassle of converting the variable to integer and using ifelse()?

library(plm)
data("Males")

table(Males$residence)

 rural_area      north_east  nothern_central           south 
     85             733             964            1333 

CodePudding user response:

Here is one tidyverse solution; I hope it helps:

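library(tidyverse)

# Map each old level to its new zone, then convert the result back to a factor.
# Rows that match neither condition (including NA) become NA.
Males <- Males %>% 
  mutate(residence = factor(case_when(residence %in% c("north_east", "nothern_central") ~ "zone_a",
                                      residence %in% c("rural_area", "south") ~ "zone_b")))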

CodePudding user response:

The levels() function is one way to approach this since it allows you to set new factor levels. You can also do something similar with the labels argument in factor() (not shown).
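For completeness, the labels route mentioned above could look roughly like the sketch below. It uses a small made-up vector rather than the Males data, so `residence`, `old_levels` and `zone` are illustrative names only. It also assumes a reasonably recent R (3.4.0 or later, as far as I know), where factor() accepts repeated labels and merges the corresponding levels.

```r
# Toy stand-in for Males$residence (hypothetical data, not the real counts)
old_levels <- c("rural_area", "north_east", "nothern_central", "south")
residence <- factor(
  c("rural_area", "north_east", "nothern_central", "south", "north_east"),
  levels = old_levels
)

# One label per existing level, in the same order; repeated labels are merged
zone <- factor(residence,
               levels = old_levels,
               labels = c("zone_b", "zone_a", "zone_a", "zone_b"))

levels(zone)
#> [1] "zone_b" "zone_a"
```

As with levels(), the labels are matched to the old levels by position, so the same care about ordering applies.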

When using levels(), you have to set the new levels in the same order as the current ones, so I always check the current levels first.

Here's an example:

# Check current levels
levels(Males$residence)
#> [1] "rural_area"      "north_east"      "nothern_central" "south"

# Set new levels in correct order
levels(Males$residence) = c("zone_b", "zone_a", "zone_a", "zone_b")

# Check that this worked
table(Males$residence)
#> 
#> zone_b zone_a 
#>   1418   1697
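If relying on the level order feels fragile, base R's levels<- also accepts a named list that maps each new level to the old levels it should absorb. A minimal sketch on made-up data (the `residence` vector below is a hypothetical stand-in, not the Males data):

```r
# Toy factor; its default levels are alphabetical, which does not matter here
residence <- factor(c("rural_area", "north_east", "nothern_central",
                      "south", "north_east", "south"))

# Each list name is a new level; its value lists the old levels to merge
levels(residence) <- list(
  zone_a = c("north_east", "nothern_central"),
  zone_b = c("rural_area", "south")
)

levels(residence)
#> [1] "zone_a" "zone_b"
```

With this form the new levels come out in the order of the list, and any old level left out of the list is turned into NA, so it is worth checking the result with table(residence, useNA = "ifany").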

A "safer" method, where you have to pair the old and new values explicitly, is fct_collapse() from the forcats package. (Thanks to @camille for pointing me to this function over fct_recode().)

library(forcats)
data(Males)
Males$residence = fct_collapse(Males$residence,
             zone_a = c("north_east", "nothern_central"),
             zone_b = c("rural_area", "south")
)

table(Males$residence)
#> 
#> zone_b zone_a 
#>   1418   1697

Created on 2021-11-02 by the reprex package (v2.0.1)
