Split decimal to multiple binary column in data frame

Time:12-24

My data frame has a column 'decimal', and I need to split each decimal value into separate binary (bit) columns.

Example: 3 (Decimal) --> 0000000000000011 (Binary)

df 
| datetime                | mc | vol | decimal |
|-------------------------|----|-----|---------|
| 2021-11-20 12:04:55.107 | PR | 50  | 1       |
| 2021-11-20 12:04:56.187 | PR | 50  | 1       |
| 2021-11-20 12:04:57.200 | PR | 50  | 3       |
| 2021-11-20 12:04:58.310 | PR | 50  | 3       |
| 2021-11-20 12:04:59.467 | PR | 50  | 5       |
| 2021-11-20 12:05:00.500 | PR | 50  | 5       |

Step 1: With the code below I got the following binary column (16-bit string, character positions 0 to 15).

df['binary'] = df.decimal.apply(lambda x: format(int(x), '016b'))

| datetime                | mc | vol | binary           |
|-------------------------|----|-----|------------------|
| 2021-11-20 12:04:55.107 | PR | 50  | 0000000000000001 |
| 2021-11-20 12:04:56.187 | PR | 50  | 0000000000000001 |
| 2021-11-20 12:04:57.200 | PR | 50  | 0000000000000011 |
| 2021-11-20 12:04:58.310 | PR | 50  | 0000000000000011 |
| 2021-11-20 12:04:59.467 | PR | 50  | 0000000000000101 |
| 2021-11-20 12:05:00.500 | PR | 50  | 0000000000000101 |

Step 2: Pick each character of the binary string and create a new column from it

df['B15'] = df['binary'].str[15]
df['B14'] = df['binary'].str[14]
df['B13'] = df['binary'].str[13]
df['B12'] = df['binary'].str[12]
df['B11'] = df['binary'].str[11]

Required output:

| datetime                | mc | vol | B11 | B12 | B13 | B14 | B15  |
|-------------------------|----|-----|-----|-----|-----|-----|------|
| 2021-11-20 12:04:55.107 | PR | 50  | 0   | 0   | 0   | 0   | 1    |
| 2021-11-20 12:04:56.187 | PR | 50  | 0   | 0   | 0   | 0   | 1    |
| 2021-11-20 12:04:57.200 | PR | 50  | 0   | 0   | 0   | 1   | 1    |
| 2021-11-20 12:04:58.310 | PR | 50  | 0   | 0   | 0   | 1   | 1    |
| 2021-11-20 12:04:59.467 | PR | 50  | 0   | 0   | 1   | 0   | 1    |
| 2021-11-20 12:05:00.500 | PR | 50  | 0   | 0   | 1   | 0   | 1    |

Is there a more efficient way to do this?

CodePudding user response:

If you only need the last 5 bits, you can use np.unpackbits:

import pandas as pd
import numpy as np

df = pd.DataFrame({'mc': ['PR', 'PR', 'PR', 'PR', 'PR', 'PR'],
                   'vol': [50, 50, 50, 50, 50, 50],
                   'decimal': [1, 1, 3, 3, 5, 5]})

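# Cast to uint8 so unpackbits yields 8 bits per row, then keep the lowest 5.
bits = pd.DataFrame(
    np.unpackbits(df.decimal.to_numpy(np.uint8)[:, np.newaxis], axis=1)[:, -5:],
    columns=[f'B{i}' for i in range(11, 16)])
# Join the bit columns back onto the original columns.
res = pd.concat((df[['mc', 'vol']], bits), axis=1)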

Result:

   mc  vol  B11  B12  B13  B14  B15
0  PR   50    0    0    0    0    1
1  PR   50    0    0    0    0    1
2  PR   50    0    0    0    1    1
3  PR   50    0    0    0    1    1
4  PR   50    0    0    1    0    1
5  PR   50    0    0    1    0    1
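
The unpackbits approach above works on 8-bit values. If bits above the low byte are ever needed, one possible alternative is a plain shift-and-mask with NumPy broadcasting. The following is only a sketch under some assumptions: the 16-bit width is taken from the '016b' format in the question, and the names WIDTH, positions, shifts and bit_matrix are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'mc': ['PR', 'PR', 'PR', 'PR', 'PR', 'PR'],
                   'vol': [50, 50, 50, 50, 50, 50],
                   'decimal': [1, 1, 3, 3, 5, 5]})

WIDTH = 16                       # assumed word size, matching '016b' in the question
positions = list(range(11, 16))  # B11..B15; B15 is the least significant bit

values = df['decimal'].to_numpy(dtype=np.uint16)

# Column Bk corresponds to string index k of the 16-character binary string,
# which is the bit reached by a right shift of (WIDTH - 1 - k).
shifts = np.array([WIDTH - 1 - k for k in positions], dtype=np.uint16)

# Broadcasting: (n_rows, 1) >> (n_cols,) gives an (n_rows, n_cols) matrix of 0/1.
bit_matrix = (values[:, np.newaxis] >> shifts) & 1

bits = pd.DataFrame(bit_matrix,
                    columns=[f'B{k}' for k in positions],
                    index=df.index)
res = pd.concat([df.drop(columns='decimal'), bits], axis=1)
print(res)
```

Changing positions to range(0, 16) should give all sixteen columns B0 to B15 without any other change, and the resulting columns are integers rather than the one-character strings produced by .str[k].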