Merge rows with specific name then rename them

Time:05-09

I have this sample dataset:

province  region_vn region_en sub_region_vn sub_region_en province_latin
  <chr>     <chr>     <chr>     <chr>         <chr>         <chr>         
1 Điện Biên Bắc Bộ    Northern  Tây Bắc Bộ    Northwest     Dien Bien     
2 Lạng Sơn  Bắc Bộ    Northern  Tây Bắc Bộ    Northeast     Lang Son    

How do I merge the two sub_region_en values Northwest and Northeast and rename them to Northern midlands and mountain areas?

The desired outcome would be:

province  region_vn region_en sub_region_vn sub_region_en                       province_latin
  <chr>     <chr>     <chr>     <chr>         <chr>                             <chr>         
1 Điện Biên Bắc Bộ    Northern  Tây Bắc Bộ    Northern midlands and mountain areas   Dien Bien     
2 Lạng Sơn  Bắc Bộ    Northern  Tây Bắc Bộ    Northern midlands and mountain areas   Lang Son  

I would appreciate any help.

CodePudding user response:

For example, if your dataset is called df, you can loop over the rows and overwrite the matching values:

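# Loop over row indices; seq_len() also handles a zero-row data frame safely
for (i in seq_len(nrow(df))) {
  # Replace the label only where it is one of the two sub-regions
  if (df$sub_region_en[i] %in% c("Northwest", "Northeast")) {
    df$sub_region_en[i] <- "Northern midlands and mountain areas"
  }
}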
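
The loop above can also be written as a single vectorised assignment, which is usually the more idiomatic form in R. The following is a minimal sketch in base R; the small `df` built here is a hypothetical stand-in for the asker's data with only two of its columns, and `to_merge` is a name introduced just for illustration.

```r
# Hypothetical stand-in for the asker's data (assumption: only two columns kept)
df <- data.frame(
  province_latin = c("Dien Bien", "Lang Son"),
  sub_region_en  = c("Northwest", "Northeast"),
  stringsAsFactors = FALSE
)

# Logical vector marking the rows whose label should be replaced
to_merge <- df$sub_region_en %in% c("Northwest", "Northeast")

# Overwrite all marked rows in one assignment, no explicit loop needed
df$sub_region_en[to_merge] <- "Northern midlands and mountain areas"

df
```

Because `%in%` returns FALSE for missing values, rows where sub_region_en is NA are left untouched.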

CodePudding user response:

Another option is to use a regular expression to match the values and the gsub() function to substitute them. Here are the steps:

# A simplified version of your data
yourdf <- structure(list(region_en = c("Northern", "Northern"), sub_region_en = c("Northwest", 
"Northeast")), class = "data.frame", row.names = c(NA, -2L))

yourdf
#  region_en sub_region_en
#1  Northern     Northwest
#2  Northern     Northeast

# Substitute the data

yourdf$sub_region_en <- gsub("Northwest|Northeast",
                             "Northern midlands and mountain areas",
                             yourdf$sub_region_en)

# The result
yourdf
#  region_en                        sub_region_en
#1  Northern Northern midlands and mountain areas
#2  Northern Northern midlands and mountain areas
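
One caveat with pattern-based replacement: an unanchored pattern such as "Northwest|Northeast" also matches inside longer strings. The sketch below illustrates this with a hypothetical third label, "Northwestern highlands", which is not in the original data; anchoring the pattern with `^` and `$` restricts the replacement to exact matches.

```r
# The third label is a hypothetical value, added only to show the pitfall
labels <- c("Northwest", "Northeast", "Northwestern highlands")
new_label <- "Northern midlands and mountain areas"

# Unanchored: the "Northwest" inside the third label is replaced as well
unanchored <- gsub("Northwest|Northeast", new_label, labels)
unanchored[3]
# "Northern midlands and mountain areasern highlands"

# Anchored: only strings that match in full are replaced
anchored <- gsub("^(Northwest|Northeast)$", new_label, labels)
anchored[3]
# "Northwestern highlands"
```

If the column only ever contains the exact labels, both patterns give the same result; the anchored form is simply the safer default.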