R: How to replace several different values in a more efficient way?

Time:05-18

Given this dataset, I would like to know whether there is a more efficient and less repetitive way to clean it up.

The dataset is a list of courses in which some names are repeated but written differently (all in capital letters, with trailing spaces, etc.).

At the moment I fix each value one by one:

# Load dataset
courses <- read.csv('courses.csv')
courses$x[courses$x == 'KOE'] <- 'Koe'
courses$x[courses$x == 'koe'] <- 'Koe'
courses$x[courses$x == 'BENL'] <- 'Benl'
courses$x[courses$x == 'engin'] <- 'Engine'
courses$x[courses$x == 'Fiqh'] <- 'Fiqh Fatwa'
courses$x[courses$x == 'Fiqh fatwa '] <- 'Fiqh Fatwa'
courses$x[courses$x == 'Islamic education'] <- 'Islamic Education'
courses$x[courses$x == 'KIRKHS'] <- 'Kirkhs'
courses$x[courses$x == 'Laws'] <- 'Law'
courses$x[courses$x == 'Pendidikan islam'] <- 'Pendidikan Islam'
courses$x[courses$x == 'Pendidikan Islam '] <- 'Pendidikan Islam'
courses$x[courses$x == 'psychology'] <- 'Psychology'

Is there a more efficient and less repetitive way to solve this problem?

Thank you very much for reading my question, any advice is welcome!

CodePudding user response:

How about

apply(courses, 2, (function(x) paste(toupper(substr(x, 1, 1)), tolower(substr(x, 2, nchar(x))), sep="")))

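The anonymous function converts the first character of each string to upper case and the rest to lower case. apply with MARGIN = 2 runs it over every column; the result is returned as a character matrix, so assign it if you want to keep it.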

Something like

courses$y = sapply(courses$x, (function(x) paste(toupper(substr(x, 1, 1)), tolower(substr(x, 2, nchar(x))), sep="")))

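is handy if the data frame has other columns you want to leave untouched, since it only reads courses$x and stores the result in a new column y.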
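Because substr, toupper, tolower and paste are all vectorised, the same transformation can probably be applied to the column directly, without apply or sapply. A minimal sketch, where cap_first is a hypothetical helper name and the four sample values are taken from the question:

```r
# Hypothetical helper: upper-case the first character, lower-case the rest.
cap_first <- function(x) {
  paste0(toupper(substr(x, 1, 1)), tolower(substr(x, 2, nchar(x))))
}

# Small sample of the values from the question.
courses <- data.frame(x = c("KOE", "koe", "BENL", "Pendidikan islam"),
                      stringsAsFactors = FALSE)

# Works on the whole column at once because every call is vectorised.
courses$y <- cap_first(courses$x)
```

Note that this only capitalises the first letter of the whole string, so a two-word value such as "Pendidikan islam" keeps its lower-case second word.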

If you can install packages, stringr has many string helpers, including str_to_title:

stringr::str_to_title(courses$x)

This capitalises the first letter of every word, which covers the case differences in your data.

CodePudding user response:

You could do:

courses$x <- tools::toTitleCase(tolower(courses$x))

courses
                   x
1                Koe
2                Koe
3               Benl
4              Engin
5               Fiqh
6         Fiqh Fatwa
7  Islamic Education
8             Kirkhs
9               Laws
10  Pendidikan Islam
11  Pendidikan Islam
12        Psychology

Data:

courses <- data.frame(x = c("KOE", "koe", "BENL", "engin", "Fiqh", "Fiqh fatwa",
                             "Islamic education", "KIRKHS", "Laws", "Pendidikan islam", 
                             "Pendidikan Islam", "psychology"))
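Title-casing alone leaves "Engin", "Fiqh" and "Laws" in the output above, while the question maps these to "Engine", "Fiqh Fatwa" and "Law". One possible way to cover them is a small lookup table applied after the case fix. This is only a sketch: the names cleaned, fixes and hit are made up for illustration, and the trailing spaces are added back to match the question's data.

```r
courses <- data.frame(x = c("KOE", "koe", "BENL", "engin", "Fiqh", "Fiqh fatwa ",
                            "Islamic education", "KIRKHS", "Laws",
                            "Pendidikan islam", "Pendidikan Islam ", "psychology"),
                      stringsAsFactors = FALSE)

# Step 1: strip stray whitespace, then normalise the case.
cleaned <- tools::toTitleCase(tolower(trimws(courses$x)))

# Step 2: lookup table for fixes that are not just case changes.
# Names are the cleaned values, elements are their replacements.
fixes <- c(Engin = "Engine", Fiqh = "Fiqh Fatwa", Laws = "Law")

# Replace only the values that appear in the lookup table.
hit <- cleaned %in% names(fixes)
cleaned[hit] <- unname(fixes[cleaned[hit]])

courses$x <- cleaned
```

Adding another correction then means adding one entry to fixes rather than another assignment line.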