Create a MultiIndex with set_index() transforms 0 and 1 into booleans

Time:02-01

I use DataFrame.set_index() to turn two columns into a MultiIndex. The problem is that the values 0 and 1 are converted into the booleans False and True.

This is the initial table. Please see the values in idx2.

|    | idx1   | idx2   |   val |
|---:|:-------|:-------|------:|
|  0 | A      |        |     1 |
|  1 | B      | False  |     2 |
|  2 | B      | True   |     3 |
|  3 | C      | 0      |     4 |
|  4 | C      | 1      |     5 |
|  5 | C      | 2      |     6 |
|  6 | C      | 3      |     7 |

After calling df.set_index(['idx1', 'idx2']), the table looks like this. Note the 4th and 5th rows: the integers 0 and 1 have been turned into the booleans False and True.

|              |   val |
|:-------------|------:|
| ('A', '')    |     1 |
| ('B', False) |     2 |
| ('B', True)  |     3 |
| ('C', False) |     4 |  <<<< should be ('C', 0)
| ('C', True)  |     5 |  <<<< should be ('C', 1)
| ('C', 2)     |     6 |
| ('C', 3)     |     7 |

This happens with pandas version 1.5.3.

Why does this happen, and is there a way to prevent it?

Here is a full MWE:

#!/usr/bin/env python3
import pandas

df = pandas.DataFrame({
    'idx1': list('ABBCCCC'),
    'idx2': ['', False, True, 0, 1, 2, 3],
    'val': range(1, 8)
})

print(df.to_markdown())


df = df.set_index(['idx1', 'idx2'])
print(df.to_markdown())

CodePudding user response:

Why this happens?

I think it would be better to open an issue on GitHub.
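
A likely explanation, offered as an assumption rather than something confirmed by the pandas developers in this thread: when the MultiIndex is built, each level is reduced to its unique values with a hash-based lookup. In Python, bool is a subclass of int, so 0 and False (and 1 and True) compare equal and have the same hash, and whichever value appears first is kept. The sketch below reproduces that effect in plain Python; unique_in_order is a hypothetical helper written for illustration, not a pandas function.

```python
# bool is a subclass of int, so 0/False and 1/True compare equal
# and share a hash value.
assert issubclass(bool, int)
assert 0 == False and 1 == True
assert hash(0) == hash(False) and hash(1) == hash(True)


def unique_in_order(values):
    """Deduplicate by hash and equality, keeping the first occurrence.

    Hypothetical helper that mimics a hash-table based unique();
    it is not part of pandas.
    """
    seen = {}
    for v in values:
        # setdefault only stores v if no equal key is present yet
        seen.setdefault(v, v)
    return list(seen.values())


idx2 = ['', False, True, 0, 1, 2, 3]
uniques = unique_in_order(idx2)
print(uniques)  # ['', False, True, 2, 3]
```

Because False and True come before 0 and 1 in idx2, the booleans are the ones that survive, which matches the output shown in the question.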

@Timeless found an [image not preserved]
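
As for preventing it, one possible workaround is to make the labels distinguishable before calling set_index(). This is a minimal sketch that assumes string labels are acceptable in the index; it converts idx2 with repr(), so the original types are no longer stored in the index, only their textual form.

```python
import pandas

df = pandas.DataFrame({
    'idx1': list('ABBCCCC'),
    'idx2': ['', False, True, 0, 1, 2, 3],
    'val': range(1, 8)
})

# Assumption: string labels are acceptable in the index.
# repr() keeps False/0 and True/1 apart ('False' vs '0', 'True' vs '1').
df['idx2'] = df['idx2'].map(repr)

df = df.set_index(['idx1', 'idx2'])
labels = df.index.tolist()
print(labels)
```

With this change the rows for C are labelled '0' and '1' instead of being merged with the boolean labels. The empty string becomes "''" because of repr(); use a custom mapping function instead if that is not wanted.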