I would like to correct a set of records using R. The last name and first name should always be separated by a comma.
names <- c("ADAM, Smith J.", "JOHNSON. Richard", "BROWN, Wilhelm K.", "DAVIS, Daniel")
Sometimes, however, a full stop has crept in as the separator, as in "JOHNSON. Richard". I would like to fix this automatically. Since the last name is always at the beginning of the string, I can match it with sub:
sub("^[[:upper:]]+\\.","^[[:upper:]]+\\,",names)
However, I cannot use a function for the replacement that specifically replaces the full stop with a comma.
Is there a way to insert a function into the replacement that does this for me?
CodePudding user response:
Your sub is mostly correct, but you'll need a capture group (the parentheses in the pattern) and a backreference (\\1) in the replacement, so that the matched last name is put back in front of the comma.
test_names <- c("ADAM, Smith J.", "JOHNSON. Richard", "BROWN, Wilhelm K.", "DAVIS, Daniel")
sub("^([[:upper:]]+)\\.","\\1,",test_names)
[1] "ADAM, Smith J." "JOHNSON, Richard" "BROWN, Wilhelm K."
[4] "DAVIS, Daniel"
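The question also asks whether a function can be applied to the matched text. sub() only accepts a string as its replacement, but base R's regexpr() and regmatches() can be combined to achieve something similar. The following is a minimal sketch under that assumption, not the only way to do it; the variable names m and fixed are made up for illustration.

```r
# Sample data from the question.
test_names <- c("ADAM, Smith J.", "JOHNSON. Richard", "BROWN, Wilhelm K.", "DAVIS, Daniel")

# Locate a leading run of capitals followed by a full stop (first match per element).
m <- regexpr("^[[:upper:]]+\\.", test_names)

# Apply a function to the matched pieces only; chartr() swaps "." for ",".
fixed <- test_names
regmatches(fixed, m) <- chartr(".", ",", regmatches(fixed, m))

print(fixed)
```

Because only the matched prefix is rewritten, the full stops after middle initials such as "J." and "K." are left alone. Any other vectorised function could take the place of chartr() here.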
CodePudding user response:
This can be done with a function like so:
names <- c("ADAM, Smith", "JOHNSON. Richard", "BROWN, Wilhelm", "DAVIS, Daniel")
replacedots <- function(mystring) {
  # Replace every full stop in the argument (not the global `names`) with a comma
  gsub("\\.", ",", mystring)
}
replacedots(names)
[1] "ADAM, Smith" "JOHNSON, Richard" "BROWN, Wilhelm" "DAVIS, Daniel"
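Note that gsub() replaces every full stop, so this only works for data without middle initials like the shortened vector above. For the original data, a variant that touches only the full stop directly after the leading last name may be safer. The sketch below assumes last names are written entirely in capitals, as in the question; the function name replace_surname_dot and the sample vector are hypothetical.

```r
# Hypothetical helper: fix only a full stop that directly follows the leading
# all-caps last name, leaving later full stops (e.g. initials) untouched.
replace_surname_dot <- function(x) {
  sub("^([[:upper:]]+)\\.", "\\1,", x)
}

names_with_initials <- c("ADAM, Smith J.", "JOHNSON. Richard")

# gsub() on every full stop also changes the initial: "ADAM, Smith J,"
all_dots <- gsub("\\.", ",", names_with_initials)

# The anchored version keeps "J." intact and only fixes "JOHNSON."
first_only <- replace_surname_dot(names_with_initials)

print(all_dots)
print(first_only)
```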