Remove characters from column names

Time:02-17

I am trying to remove some unwanted characters (numbers, dots, and spaces) from my column names in R. My column names are as follows.

My data is a tibble:

tibble [33 x 38] (S3: tbl_df/tbl/data.frame)
 $ year : chr [1:33] "1988" "1989" "1990" "1991" ...
 $ VALOR AGREGADO BRUTO (a precios básicos) : num [1:33] 9906283 11624212 14163419 17400488 19785184 ...
 $ 1. PRODUCTOS AGRÍCOLAS NO INDUSTRIALES : num [1:33] 831291 911372 1112167 1434213 1532067 ...
 $ 2. PRODUCTOS AGRÍCOLAS INDUSTRIALES : num [1:33] 143426 214369 231168 341144 282777 ...
 $ 3. COCA : num [1:33] 118273 153689 195108 190264 199259 ...

The desired column names are:

tibble [33 x 38] (S3: tbl_df/tbl/data.frame)
 $ year : chr [1:33] "1988" "1989" "1990" "1991" ...
 $ VALOR AGREGADO BRUTO (a precios básicos) : num [1:33] 9906283 11624212 14163419 17400488 19785184 ...
 $ PRODUCTOS AGRÍCOLAS NO INDUSTRIALES : num [1:33] 831291 911372 1112167 1434213 1532067 ...
 $ PRODUCTOS AGRÍCOLAS INDUSTRIALES : num [1:33] 143426 214369 231168 341144 282777 ...
 $ COCA : num [1:33] 118273 153689 195108 190264 199259 ...

I want to remove the leading number and the . from the column names. I tried the following, but it does not work:

colnames(data) <- sub("\\1:4\.\\", "", colnames(data))
colnames(data)

Could somebody please help me?

Best! Marcelo

CodePudding user response:

library(stringr)

t <- c("1. PRODUCTOS AGRÍCOLAS NO INDUSTRIALES",
"2. PRODUCTOS AGRÍCOLAS INDUSTRIALES",
"3. SILVICULTURA, CAZA Y PESCA",
"4. PRODUCTOS PECUARIOS")

# Anchor at the start and escape the dot so it matches a literal "." rather than any character
str_remove(t, "^[1-4]\\. ")
#> [1] "PRODUCTOS AGRÍCOLAS NO INDUSTRIALES" "PRODUCTOS AGRÍCOLAS INDUSTRIALES"   
#> [3] "SILVICULTURA, CAZA Y PESCA"          "PRODUCTOS PECUARIOS"

Created on 2022-02-16 by the reprex package (v2.0.1)

CodePudding user response:

We can use a pattern that reads: remove the match if the string starts with one or more digits followed by a dot and a space.

library(stringr)

data <- c("1. PRODUCTOS AGRÍCOLAS NO INDUSTRIALES",
"2. PRODUCTOS AGRÍCOLAS INDUSTRIALES",
"3. SILVICULTURA, CAZA Y PESCA",
"4. PRODUCTOS PECUARIOS") 
  
str_replace(data, '^\\d+\\. ', "")
#> [1] "PRODUCTOS AGRÍCOLAS NO INDUSTRIALES" "PRODUCTOS AGRÍCOLAS INDUSTRIALES"   
#> [3] "SILVICULTURA, CAZA Y PESCA"          "PRODUCTOS PECUARIOS"

Created on 2022-02-16 by the reprex package (v2.0.1)
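
Both answers demonstrate the pattern on a plain character vector, while the question asks about column names. A minimal sketch of that last step follows, using base R only. The small data frame is a hypothetical stand-in for the asker's tibble, built with just a few of the columns shown in the question.

```r
# Hypothetical stand-in for the asker's tibble; check.names = FALSE keeps
# the original names (spaces, digits, dots) instead of sanitising them.
data <- data.frame(
  "year" = c("1988", "1989"),
  "VALOR AGREGADO BRUTO (a precios básicos)" = c(9906283, 11624212),
  "1. PRODUCTOS AGRÍCOLAS NO INDUSTRIALES" = c(831291, 911372),
  "3. COCA" = c(118273, 153689),
  check.names = FALSE
)

# Strip a leading run of digits, a literal dot, and any whitespace after it.
# Names without such a prefix (e.g. "year") are left unchanged.
names(data) <- sub("^[0-9]+\\.\\s*", "", names(data))

names(data)
```

Assigning to names(data) should work the same way on a tibble. With dplyr loaded, dplyr::rename_with(data, ~ stringr::str_remove(.x, "^\\d+\\. ")) would be an equivalent pipe-friendly form.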
