I have a dataframe with the following structure:
Timestamp Payload
2022-03-16 18:00 3c313034393536353534303b31332e32333534313b303b302e303637313b302e303138313b332e33353b36322e31353338323e
2022-03-16 18:10 3c313034393536363134303b31332e32313134383b303b302e303637323b302e3031383b342e3632353b37332e34353133313e
2022-03-16 18:20 3c313034393536363734303b31332e31393630313b303b302e303637323b302e3031383b342e303132353b36322e34343238353e
2022-03-16 18:30 3c313034393537353734303b31332e31303137343b303b302e3036373b302e303137393b342e383132353b37352e343931393e
2022-03-16 18:40 3c313034393538323934303b31332e30353934353b303b302e303636373b302e303137393b332e313337353b37352e37313830393e
What I would like is to create a new dataframe where those two fields appear, and also the decoding of the payload column. The difficulty I have is that the decoding of that set of characters generates a set of data. For example if we decode the following line,
3c313034393536353534303b31332e32333534313b303b302e303637313b302e303138313b332e33353b36322e31353338323e
we obtain,
<1049565540;13.23541;0;0.0671;0.0181;3.35;62.15382>
Where the characters <
, >
and ;
are separators. So my goal would be to obtain something similar to the following dataframe,
2022-03-16 18:00 1049565540 13.23541 0 0.0671 0.0181 3.3500 62.15382 3c313034393536353534303b31332e32333534313b303b302e303637313b302e303138313b332e33353b36322e31353338323e
2022-03-16 18:10 1049566140 13.21148 0 0.0672 0.0180 4.6250 73.45131 3c313034393536363134303b31332e32313134383b303b302e303637323b302e3031383b342e3632353b37332e34353133313e
2022-03-16 18:20 1049566740 13.19601 0 0.0672 0.0180 4.0125 62.44285 3c313034393536363734303b31332e31393630313b303b302e303637323b302e3031383b342e303132353b36322e34343238353e
2022-03-16 18:30 1049575740 13.10174 0 0.0670 0.0179 4.8125 75.49190 3c313034393537353734303b31332e31303137343b303b302e3036373b302e303137393b342e383132353b37352e343931393e
2022-03-16 18:40 1049582940 13.05945 0 0.0667 0.0179 3.1375 75.71809 3c313034393538323934303b31332e30353934353b303b302e303636373b302e303137393b332e313337353b37352e37313830393e
Actually I'm doing the following code, it's ok but I'm sure I'm not taking advantage of the potential of the Pandas
library.
def myFunc(pay):
decode = bytearray.fromhex(pay[2:-2]).decode()
return decode.split(";")
output2['datos'] = output2['payload'].apply(myFunc)
payloads = output2['datos'].values
df = pd.DataFrame(payloads.tolist(), columns=None)
Can it be done in a more efficient way?
Thank you very much!
CodePudding user response:
If want working by special functions is possible use .apply
like your solution or list comprehension:
L = [bytearray.fromhex(pay[2:-2]).decode().split(";") for pay in output2.pop('Payload')]
df = pd.DataFrame(L, index=output2.index)
Last use join
:
df = output2.join(df)
print (df)
Timestamp 0 1 2 3 4 5 6
0 2022-03-16 18:10 1049566140 13.21148 0 0.0672 0.018 4.625 73.45131
1 2022-03-16 18:20 1049566740 13.19601 0 0.0672 0.018 4.0125 62.44285
2 2022-03-16 18:30 1049575740 13.10174 0 0.067 0.0179 4.8125 75.4919
3 2022-03-16 18:40 1049582940 13.05945 0 0.0667 0.0179 3.1375 75.71809
If need also original column:
L = [bytearray.fromhex(pay[2:-2]).decode().split(";") for pay in output2['Payload']]
df = pd.DataFrame(L, index=output2.index)
df = output2.join(df)
print (df)
Timestamp Payload \
0 2022-03-16 18:10 3c313034393536363134303b31332e32313134383b303b...
1 2022-03-16 18:20 3c313034393536363734303b31332e31393630313b303b...
2 2022-03-16 18:30 3c313034393537353734303b31332e31303137343b303b...
3 2022-03-16 18:40 3c313034393538323934303b31332e30353934353b303b...
0 1 2 3 4 5 6
0 1049566140 13.21148 0 0.0672 0.018 4.625 73.45131
1 1049566740 13.19601 0 0.0672 0.018 4.0125 62.44285
2 1049575740 13.10174 0 0.067 0.0179 4.8125 75.4919
3 1049582940 13.05945 0 0.0667 0.0179 3.1375 75.71809
CodePudding user response:
According to your example, separated with blanks.
def myFunc(pay):
arr = []
for val in pay:
pay[val] = bytearray.fromhex(val[2:-2]).decode().split(";")
arr.append(" ".join(pay[val]))
return arr
df["Payload"] = myFunc(df["Payload"])
df
Timestamp Payload
0 2022-03-16 18:00 1049565540 13.23541 0 0.0671 0.0181 3.35 62.15382
1 2022-03-16 18:10 1049566140 13.21148 0 0.0672 0.018 4.625 73.45131
2 2022-03-16 18:20 1049566740 13.19601 0 0.0672 0.018 4.0125 62.44285
3 2022-03-16 18:30 1049575740 13.10174 0 0.067 0.0179 4.8125 75.4919
4 2022-03-16 18:40 1049582940 13.05945 0 0.0667 0.0179 3.1375 75.71809