Pandas read_html ignoring some decimal commas

Time:07-23

I have an HTML file containing this table:

A B C
32.412,18 57,77 3,25ml
2.345,44 42,34 4,55ml
2.111,44 31,51 5,12ml

I'm using pandas read_html to read it into a DataFrame, but I get an unexpected result: the decimal comma in column B is ignored:

A B C
32.412,18 5777 3,25ml
2.345,44 4234 4,55ml
2.111,44 3151 5,12ml

I also tried passing thousands='.', decimal=',', but it still doesn't work as expected.

What am I missing? I am using pandas 1.3.5

CodePudding user response:

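By default, read_html treats ',' as the thousands separator (thousands=','), so '57,77' is read as 5777. You can disable thousands handling by passing None or '' (an empty string):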

import pandas as pd

df = pd.read_html('test.html', thousands=None)[0]  # do not strip ',' as a thousands separator
print(df)

           A      B       C
0  32.412,18  57,77  3,25ml
1   2.345,44  42,34  4,55ml
2   2.111,44  31,51  5,12ml
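
With thousands=None the values stay as strings, so a follow-up conversion may still be needed if you want numbers. The following is only a rough sketch of one way to do that, not part of the original answer: the helper name to_float and the new column C_ml are made up for illustration, and the DataFrame is built by hand as a stand-in for what read_html returns above.

```python
import pandas as pd

# Stand-in for the frame returned by pd.read_html('test.html', thousands=None)[0]
df = pd.DataFrame({
    "A": ["32.412,18", "2.345,44", "2.111,44"],
    "B": ["57,77", "42,34", "31,51"],
    "C": ["3,25ml", "4,55ml", "5,12ml"],
})

def to_float(series):
    """Hypothetical helper: '32.412,18' -> 32412.18 (dot = thousands, comma = decimal)."""
    cleaned = (
        series.astype(str)
        .str.replace(".", "", regex=False)   # drop thousands separators
        .str.replace(",", ".", regex=False)  # turn the decimal comma into a dot
    )
    return cleaned.astype(float)

df["A"] = to_float(df["A"])
df["B"] = to_float(df["B"])
# Column C carries a unit suffix, so strip 'ml' before converting.
df["C_ml"] = to_float(df["C"].str.replace("ml", "", regex=False))

print(df.dtypes)
```

Doing the replacement explicitly keeps the conversion visible per column, which helps when some columns (like C here) mix numbers with units.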