Adding data to new columns from a previous one

Time:06-02

cars <- data.frame(vehicle.type = c('Skoda Oktavia', 'Hyundai i40', 'Skoda Superb'))

How can I add the data from vehicle.type to two new columns (brand and model)? E.g.: brand - Skoda; model - Oktavia

CodePudding user response:

Edit after comment

You can separate your column based on the first whitespace using the following code:

library(dplyr)
library(stringr)
library(tidyr)
cars %>%
  mutate(vehicle.type = str_replace(vehicle.type, "\\s", "|")) %>%
  separate(vehicle.type, c("brand", "model"), sep = "\\|")

Output:

    brand    model
1   Skoda  Oktavia
2 Hyundai      i40
3   Skoda   Superb
4 Hyundai Santa Fe

Data

cars <- data.frame(vehicle.type = c('Skoda Oktavia', 'Hyundai i40', 'Skoda Superb', "Hyundai Santa Fe"))

Old answer

You can separate your column using the following code:

library(dplyr)
library(tidyr)
cars %>%
  separate(vehicle.type, c("brand", "model"), " ")

Output:

    brand   model
1   Skoda Oktavia
2 Hyundai     i40
3   Skoda  Superb
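
As a possible alternative to the str_replace step, separate() has an extra argument; with extra = "merge" it should split only at the first space, so a multi-word model such as 'Santa Fe' stays together. A minimal sketch, assuming the four-row data used in this thread (the name result is only for illustration):

```r
library(tidyr)

cars <- data.frame(vehicle.type = c("Skoda Oktavia", "Hyundai i40",
                                    "Skoda Superb", "Hyundai Santa Fe"))

# extra = "merge" keeps everything after the first split in the last column,
# so "Hyundai Santa Fe" becomes brand = "Hyundai", model = "Santa Fe"
result <- separate(cars, vehicle.type, into = c("brand", "model"),
                   sep = " ", extra = "merge")
result
```

Without extra = "merge", the default behaviour is to warn and drop the surplus piece ('Fe').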

CodePudding user response:

EDIT This approach also works for the case you gave in a comment: 'Hyundai Santa Fe'.

You might try str_split_fixed from package {stringr}:

# data 
cars <- data.frame(vehicle.type = c('Skoda Oktavia', 'Hyundai i40', 
                                    'Skoda Superb', 'Hyundai Santa Fe'))

# approach
library(stringr)
cars[, c("brand", "model")] <- str_split_fixed(cars$vehicle.type, " ", 2)

#result
cars
#>       vehicle.type   brand    model
#> 1    Skoda Oktavia   Skoda  Oktavia
#> 2      Hyundai i40 Hyundai      i40
#> 3     Skoda Superb   Skoda   Superb
#> 4 Hyundai Santa Fe Hyundai Santa Fe

Created on 2022-06-02 by the reprex package (v2.0.1)

CodePudding user response:

Another approach is with extract (the regex allows for multiple 'words' in the model part):

library(dplyr)
library(tidyr)
cars %>%
  extract(vehicle.type, 
          into = c("brand", "model"),
          regex = "(\\S+)\\s(.*)")

Output:

    brand    model
1   Skoda  Oktavia
2 Hyundai      i40
3 Hyundai Santa Fe
4   Skoda   Superb

Data:

cars <- data.frame(vehicle.type = c('Skoda Oktavia', 'Hyundai i40', 'Hyundai Santa Fe', 'Skoda Superb'))

CodePudding user response:

Base R:

data.frame(do.call("rbind", strsplit(as.character(cars$vehicle.type), " ", fixed = TRUE))) |>
  cbind(cars) |>
  setNames(c("brand", "model", "vehicle.type"))

Output:

    brand   model  vehicle.type
1   Skoda Oktavia Skoda Oktavia
2 Hyundai     i40   Hyundai i40
3   Skoda  Superb  Skoda Superb
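
The rbind approach assumes every entry consists of exactly two words, so an entry like 'Hyundai Santa Fe' would not split cleanly. A base R sketch that splits only at the first space, using sub() (the four-row data is taken from the other answers and is an assumption here):

```r
cars <- data.frame(vehicle.type = c("Skoda Oktavia", "Hyundai i40",
                                    "Skoda Superb", "Hyundai Santa Fe"))

# brand: remove the first space and everything after it
cars$brand <- sub(" .*", "", cars$vehicle.type)
# model: remove the leading run of non-space characters and the space after it
cars$model <- sub("^\\S+ ", "", cars$vehicle.type)
cars
```

This keeps the original vehicle.type column and needs no extra packages.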