Remove [ and ] from a dataframe column - R

Time:09-29

I am trying to remove the characters [ and ] from a data frame column using the code below.

Code:

colA <- 1:4
colB <- ("[123 123;22 34;556 55; 23 22]")
tryDF <- data.frame(colA, colB)
gsub("[","",tryDF$ColB)
gsub("]","",tryDF$ColB)

but I get this error:

Error in gsub("[", "", tryDF$ColB) : 
  invalid regular expression '[', reason 'Missing ']''
In addition: Warning message:
In gsub("[", "", tryDF$ColB) : TRE pattern compilation error 'Missing ']''

Any idea how to solve it?

CodePudding user response:

You can remove all [ and ] characters with:

gsub("[][]", "", tryDF$colB)

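Since [ and ] delimit a character class in a regex, the pattern [][] is a character class that contains ] and [. A ] placed first inside the brackets is treated as a literal character. A lone [ opens a class that is never closed, which is why gsub("[", ...) fails with Missing ']'.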
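As a minimal sketch, assuming the data frame from the question, the bracket expression can be applied and the result assigned back to the column. Two assumptions are worth flagging: the column is referenced as colB (lower-case c, since R names are case-sensitive and tryDF$ColB would be NULL), and the result is written back because gsub() returns a new vector rather than modifying the data frame in place.

```r
# Rebuild the data frame from the question (colB is recycled to 4 rows)
colA <- 1:4
colB <- "[123 123;22 34;556 55; 23 22]"
tryDF <- data.frame(colA, colB)

# "[][]" is a bracket expression containing "]" and "[";
# a "]" placed first inside the brackets is taken literally.
# Note the lower-case "c" in colB.
tryDF$colB <- gsub("[][]", "", tryDF$colB)

print(tryDF)
```

After this runs, every row of colB should hold the string without the surrounding brackets.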

CodePudding user response:

You can do this in many ways:

a) with alternation:

 gsub("\\[|\\]", "", tryDF$colB)

b) with anchors, since in your case [ and ] always occur in the first and last position of the string, respectively (note that this pattern removes the first and last character whatever they are):

gsub("^.|.$", "", tryDF$colB)

c) with a negative lookahead that excludes whitespace and ; from \\W (capital W: it matches any non-word character, that is, anything other than a letter, digit, or underscore, and therefore also [ and ]); this requires perl = TRUE:

gsub("(?![\\s;])\\W", "", tryDF$colB, perl = TRUE)
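The three patterns can be compared side by side. The sketch below applies them to the string from the question and then to a second string, y, which is a hypothetical input (not from the question) with brackets in the middle. It is meant to illustrate how the anchor approach differs when the brackets are not the first and last characters.

```r
x <- "[123 123;22 34;556 55; 23 22]"

res_a <- gsub("\\[|\\]", "", x)                     # a) alternation
res_b <- gsub("^.|.$", "", x)                       # b) anchors
res_c <- gsub("(?![\\s;])\\W", "", x, perl = TRUE)  # c) negative lookahead

# All three agree on the question's input
print(identical(res_a, res_b))
print(identical(res_a, res_c))

# Hypothetical input with brackets in the middle of the string
y <- "123 [45] 678"
y_a <- gsub("\\[|\\]", "", y)  # removes only the brackets
y_b <- gsub("^.|.$", "", y)    # removes the first and last character instead
print(y_a)
print(y_b)
```

If the brackets can appear anywhere in the string, the alternation or the bracket expression is the safer choice; the anchor version is only suitable when the position of the brackets is guaranteed.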