Remove letters from my numeric columns doesn't work

Time:09-09

I have an x_train like this (all the columns are of object type):

a     b    c
1      2   523f
2     45   52A
3     32    95
4    245    84A
5     86    42
6      7    52
7     45    31
7a    45    712
8b    53    62
194v  34    3

Y_train contains only 0 and 1. I tried to use RF.fit(x_train, Y_train) but I got an error:

could not convert string to float: 7a

I want to keep only the numeric values and remove the letters, so I tried something like:

x_train = re.findall(r'\d \d ', x['a'])

but it doesn't work. How can I fix this?

CodePudding user response:

Assuming all integers, you can use this for any column that has non-numeric values:

df[col] = df[col].str.replace(r'\D', '', regex=True).astype(int)
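
For reference, here is a minimal sketch of that one-liner applied to every column of the sample data from the question. The DataFrame is rebuilt by hand here, and it assumes every cell contains at least one digit, as in the sample.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand; every column holds strings (object dtype).
x_train = pd.DataFrame({
    "a": ["1", "2", "3", "4", "5", "6", "7", "7a", "8b", "194v"],
    "b": ["2", "45", "32", "245", "86", "7", "45", "45", "53", "34"],
    "c": ["523f", "52A", "95", "84A", "42", "52", "31", "712", "62", "3"],
})

# Strip every non-digit character (\D), then cast the remaining digits to int.
for col in x_train.columns:
    x_train[col] = x_train[col].str.replace(r"\D", "", regex=True).astype(int)

print(x_train.dtypes)
```

If a cell could contain no digits at all, the replacement leaves an empty string and astype(int) raises an error. In that case pd.to_numeric(..., errors='coerce') is one option: it turns such cells into NaN instead.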