Pandas data frame: convert Int column into binary in python

Time:05-22

I have a DataFrame, for example like the one below:

Event['EVENT_ID'] = [ 4162, 4161, 4160, 4159,4158, 4157, 4156, 4155, 4154]

I need to convert the value in each row to binary.

Event['b'] = bin(Event['EVENT_ID']) doesn't work:
TypeError: cannot convert the series to <class 'int'>

Expected result: a new column with the binary value, with the 0b prefix removed and the bits split into 16 separate columns.

bin(4162) = '0b1000001000010'

CodePudding user response:

I don't think using the bin function and working with the str type is particularly efficient. Please consider using bitmasks.

# bit0 is the least significant bit; mask each bit and map non-zero to 1
for i in range(16):
    df[f"bit{i}"] = df["EVENT_ID"].apply(lambda x: x & (1 << i)).astype(bool).astype(int)

Testing with your data, I get the following results:

   EVENT_ID              B  bit0  bit1  bit2  bit3  bit4  bit5  bit6  bit7  \
0      4162  1000001000010     0     1     0     0     0     0     1     0   
1      4161  1000001000001     1     0     0     0     0     0     1     0   
2      4160  1000001000000     0     0     0     0     0     0     1     0   
3      4159  1000000111111     1     1     1     1     1     1     0     0   
4      4158  1000000111110     0     1     1     1     1     1     0     0   
5      4157  1000000111101     1     0     1     1     1     1     0     0   
6      4156  1000000111100     0     0     1     1     1     1     0     0   
7      4155  1000000111011     1     1     0     1     1     1     0     0   
8      4154  1000000111010     0     1     0     1     1     1     0     0   

   bit8  bit9  bit10  bit11  bit12  bit13  bit14  bit15  
0     0     0      0      0      1      0      0      0  
1     0     0      0      0      1      0      0      0  
2     0     0      0      0      1      0      0      0  
3     0     0      0      0      1      0      0      0  
4     0     0      0      0      1      0      0      0  
5     0     0      0      0      1      0      0      0  
6     0     0      0      0      1      0      0      0  
7     0     0      0      0      1      0      0      0  
8     0     0      0      0      1      0      0      0 
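The same bitmask idea can also be done without a Python-level loop over rows. The following is only a sketch, assuming NumPy is available and that 16 bits is the width you want; the names n_bits, bits and bit_cols are made up for the example. It shifts every value right by 0 to 15 in one broadcasted operation and keeps the lowest bit.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"EVENT_ID": [4162, 4161, 4160, 4159, 4158, 4157, 4156, 4155, 4154]})

n_bits = 16  # assumed width, matching the 16 columns asked for in the question

# Column vector of values, broadcast against the shift amounts 0..15.
values = df["EVENT_ID"].to_numpy()[:, None]   # shape (rows, 1)
bits = (values >> np.arange(n_bits)) & 1      # shape (rows, 16); bit0 = least significant

bit_cols = pd.DataFrame(
    bits,
    columns=[f"bit{i}" for i in range(n_bits)],
    index=df.index,
)
result = df.join(bit_cols)
print(result)
```

As in the loop version, bit0 is the least significant bit, so 4162 (4096 + 64 + 2) sets bit1, bit6 and bit12.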

CodePudding user response:

Try this

import pandas as pd

Event = pd.DataFrame({'EVENT_ID': [4162, 4161, 4160, 4159, 4158, 4157, 4156, 4155, 4154]})
# use the bin function in a list comprehension and convert each bit to a separate column
binary_values = pd.DataFrame([list(bin(x)[2:]) for x in Event['EVENT_ID']])
# join the new df to Event
df = Event.join(binary_values)
print(df)

   EVENT_ID  0  1  2  3  4  5  6  7  8  9 10 11 12
0      4162  1  0  0  0  0  0  1  0  0  0  0  1  0
1      4161  1  0  0  0  0  0  1  0  0  0  0  0  1
2      4160  1  0  0  0  0  0  1  0  0  0  0  0  0
3      4159  1  0  0  0  0  0  0  1  1  1  1  1  1
4      4158  1  0  0  0  0  0  0  1  1  1  1  1  0
5      4157  1  0  0  0  0  0  0  1  1  1  1  0  1
6      4156  1  0  0  0  0  0  0  1  1  1  1  0  0
7      4155  1  0  0  0  0  0  0  1  1  1  0  1  1
8      4154  1  0  0  0  0  0  0  1  1  1  0  1  0

CodePudding user response:

You can try turning the int Series into binary strings by applying '{0:b}'.format, then splitting the resulting list column into multiple columns with pd.DataFrame:

df = df.join(pd.DataFrame(df['EVENT_ID'].apply('{0:b}'.format).apply(list).tolist()))
print(df)

   EVENT_ID  0  1  2  3  4  5  6  7  8  9 10 11 12
0      4162  1  0  0  0  0  0  1  0  0  0  0  1  0
1      4161  1  0  0  0  0  0  1  0  0  0  0  0  1
2      4160  1  0  0  0  0  0  1  0  0  0  0  0  0
3      4159  1  0  0  0  0  0  0  1  1  1  1  1  1
4      4158  1  0  0  0  0  0  0  1  1  1  1  1  0
5      4157  1  0  0  0  0  0  0  1  1  1  1  0  1
6      4156  1  0  0  0  0  0  0  1  1  1  1  0  0
7      4155  1  0  0  0  0  0  0  1  1  1  0  1  1
8      4154  1  0  0  0  0  0  0  1  1  1  0  1  0
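Note that the string-based output above has 13 columns, because these values need only 13 bits, while the question asks for 16. One possible way to get a fixed width is to zero-pad the binary string before splitting it. This is a rough sketch, assuming a width of 16; the names width, bit_strings and padded are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"EVENT_ID": [4162, 4161, 4160, 4159, 4158, 4157, 4156, 4155, 4154]})

width = 16  # assumed fixed width, taken from the question

# format(x, "016b") gives the binary digits without the 0b prefix, left-padded with zeros.
bit_strings = df["EVENT_ID"].map(lambda x: format(x, f"0{width}b"))

# One character per column; column 0 is the most significant bit.
bit_cols = pd.DataFrame([list(s) for s in bit_strings], index=df.index).astype(int)

padded = df.join(bit_cols)
print(padded)
```

Here column 0 holds the most significant bit, which is the opposite order to the bit0..bit15 columns in the bitmask answer.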