Standardizing the character strings in multiple rows (R or Unix)

Time:03-14

I would like to standardize all the _xxxxxx character strings in the V1 column to the xxxxxxH format.

V1          V2      V3
122223H     20      Test kits
122224H     23      Test kits
122225H     42      Test kits
122227H     31      Test kits
_122228     23      Test kits
_122229     57      Test kits
_122231     21      Test kits
122232H     33      Test kits
122234H     22      Test kits
.......     ..      .... ....
.......     ..      .... ....
.......     ..      .... ....
122250H     33      Test kits

I tried to solve it with the gsub function in R, but couldn't produce the exact format I need. Any suggestions would be appreciated. Unix-based commands are also welcome.

df <- gsub("_","H",c(file$V1))

Output:

"H122228" "H122229" "H122231"

Desired output:

V1          V2      V3
122223H     20      Test kits
122224H     23      Test kits
122225H     42      Test kits
122227H     31      Test kits
122228H     23      Test kits
122229H     57      Test kits
122231H     21      Test kits
122232H     33      Test kits
122234H     22      Test kits
.......     ..      .... ....
.......     ..      .... ....
.......     ..      .... ....
122250H     33      Test kits

CodePudding user response:

Just replace the number with the number followed by an H in those cases where the string begins with an underscore:

file <- data.frame(v1 = c("122227H", "_122231"))
file$v1 <- gsub("_(\\d+)", "\\1H", file$v1)

Output:

"122227H" "122231H"
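
Since the question also asks about Unix commands, here is a rough sketch of the same substitution with sed. It is not part of the original answer. The function name standardize_ids and the sample rows are made up for illustration, and the input is assumed to be plain text with the ID at the start of each line.

```shell
# standardize_ids is a hypothetical helper name, not from the thread.
# It reads rows on stdin and rewrites a leading "_digits" as "digitsH".
standardize_ids() {
  # -E enables extended regular expressions (GNU and BSD sed).
  # ^_ anchors the underscore to the start of the line, so only
  # the first column can match; \1 is the captured run of digits.
  sed -E 's/^_([0-9]+)/\1H/'
}

# Sample rows (assumed layout: whitespace-separated, header first).
printf '%s\n' \
  'V1          V2      V3' \
  '122227H     31      Test kits' \
  '_122228     23      Test kits' | standardize_ids
```

Because "_122228" and "122228H" have the same length, the column alignment of the file is left untouched. Rows that already end in H do not match the pattern and pass through unchanged.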

CodePudding user response:

Try the following, though more elegant solutions may exist:

df <- data.frame(v1 = c("122223H","122224H","122225H","122227H","_122228","_122229"),
           v2 = c(21,23,42,31,23,57),
           v3 = rep("Test Kits", times = 6))


# Strip any underscore, then append "H" to values that do not already contain one.
df$newstring <- gsub("_", "", df$v1)
df$newstring <- ifelse(grepl("H", df$newstring, fixed = TRUE), df$newstring, paste0(df$newstring, "H"))


# > df
#        v1 v2        v3 newstring
# 1 122223H 21 Test Kits   122223H
# 2 122224H 23 Test Kits   122224H
# 3 122225H 42 Test Kits   122225H
# 4 122227H 31 Test Kits   122227H
# 5 _122228 23 Test Kits   122228H
# 6 _122229 57 Test Kits   122229H
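
For a Unix take on the same two-step logic (strip the underscore, then append H where it is missing), a minimal awk sketch might look like the following. The helper name strip_then_append is hypothetical, and the sketch assumes a header row on line 1 and IDs that start at the beginning of each line.

```shell
# strip_then_append is a hypothetical helper name for this sketch.
strip_then_append() {
  awk '
    NR > 1 {                     # skip the header row (assumed present)
      sub(/^_/, "")              # step 1: drop a leading underscore
      if ($1 !~ /H$/)            # step 2: first column lacks the H suffix
        sub(/^[^ \t]+/, "&H")    # append H to the first column in place
    }
    { print }                    # print every row, changed or not
  '
}

# Sample rows (assumed layout: whitespace-separated, header first).
printf '%s\n' \
  'V1          V2      V3' \
  '122227H     31      Test kits' \
  '_122228     23      Test kits' | strip_then_append
```

Both substitutions act on the whole line rather than assigning to $1, which would make awk rebuild the line with single spaces and lose the original column spacing.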