Python - Searching specific hex bytes in a series of Hex bytes using Python-CodePudding

Input bytes: 35 04 65 FF D0 00 10 24 D0 01 10 24 E0 20 10 2C 84 D0 05 10 24 D0 07 10 24

I have a series of bytes, shown above, coming out of a diagnostics tool. I want to search for "E0 20" and retrieve the 3 bytes that come right after it.

E.g., the 3 bytes that follow E0 20 are 10 2C 84.

How can I do this, and what built-in support does Python have for it?

Regards, Joseph

I tried and found solutions for similar problems, but as a beginner I wanted to understand the most efficient way of doing this.

CodePudding user response：

Approach 1) Regular expressions

If we generalize this to the problem of extracting string matches, then regular expressions (RegEx) are useful. The general way you solve a string matching problem with RegEx is like this:

1. Think of what you want to extract, and what the inputs should look like.
2. Create a regex pattern which matches what you're looking for. I suggest adding parentheses around the subpattern you'd like to extract so you can use group pattern extraction.
3. Optionally, compile the regex for better performance.

Here's a working example which extracts the 3 letters following a subpattern.

import re

# Matches on the literal "hello my name is " and then
# makes a group extraction to collect the next 3 letters
pattern = re.compile(r"hello my name is ([a-zA-Z]{3})")

# First example shouldn't have a match.
# The following examples should have their 3 first letters extracted.
examples = [
    "",
    "hello my name is Bob",
    "hello my name is Alice"
]

for example in examples:
  matches = pattern.findall(example)
  
  # A match is not guaranteed.
  # findall may return multiple matches as well.
  # Consider changing `findall` to a different regex method
  # as needed.
  if matches:
    print(matches)

# Outputs:
# ['Bob']
# ['Ali']
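
The same idea can be applied to the bytes in the question, since the re module also accepts bytes patterns. What follows is only a sketch: the names data, marker and following are made up for illustration, and it assumes the tool's output has already been turned into a bytes object.

```python
import re

# Bytes from the question, parsed from their hex text form.
data = bytes.fromhex(
    "35 04 65 FF D0 00 10 24 D0 01 10 24 E0 20 10 2C 84 D0 05 10 24 D0 07 10 24"
)
marker = bytes.fromhex("E0 20")

# re.escape guards against marker bytes that happen to be regex metacharacters.
# re.DOTALL lets "." match any byte, including 0x0A (newline).
pattern = re.compile(re.escape(marker) + rb"(.{3})", re.DOTALL)

match = pattern.search(data)
following = match.group(1) if match else None

# bytes.hex() accepts a separator argument from Python 3.8 onwards.
print(following.hex(" ") if following else "marker not found")
```

Swapping search for finditer should give every occurrence of the marker rather than just the first one.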

Approach 2) Do a single loop over the input bytes

In your case, since you're looking for an exact string match, RegEx might be overkill. You can probably get away with doing a single loop over the input string to extract a match, if any. I won't provide an example for this, but here's a sketch:

for i, byte_chunk in enumerate(input_bytes):
    if byte_chunk == TARGET_BYTE_CHUNK:
        do_something_with_byte_chunks(i + 2)
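
One way the sketch above could be made runnable is shown here. The function name bytes_after_marker and its parameters are hypothetical, and it assumes the input is a bytes object and that only the first occurrence of the marker matters. It compares a slice at each position because the marker is two bytes long, so a single-byte comparison would not be enough.

```python
def bytes_after_marker(data: bytes, marker: bytes, count: int):
    """Scan data once and return the `count` bytes after the first marker.

    Returns None if the marker is absent or fewer than `count` bytes follow it.
    """
    # Last index at which the marker plus `count` trailing bytes still fits.
    last_start = len(data) - len(marker) - count
    for i in range(last_start + 1):
        if data[i:i + len(marker)] == marker:
            start = i + len(marker)
            return data[start:start + count]
    return None


input_bytes = bytes.fromhex(
    "35 04 65 FF D0 00 10 24 D0 01 10 24 E0 20 10 2C 84 D0 05 10 24 D0 07 10 24"
)
result = bytes_after_marker(input_bytes, bytes.fromhex("E0 20"), 3)
print(result.hex(" ") if result is not None else "marker not found")
```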

CodePudding user response：

The bytes object has a find method that you could use to locate the sequence you are interested in.

For example:

input_data = bytes.fromhex('35 04 65 FF D0 00 10 24 D0 01 10 24 E0 20 10 2C 84 D0 05 10 24 D0 07 10 24')
search_data = bytes.fromhex('E0 20')
data_len = 3

data_start = input_data.find(search_data) + len(search_data)
data_out = input_data[data_start:data_start + data_len]
print(data_out.hex())
# 102c84
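
One thing to watch with find: it returns -1 when the sequence is absent, and adding the marker length to -1 would silently produce a wrong slice instead of an error. A small wrapper along these lines could guard against that. The name find_following is hypothetical, not part of the standard library.

```python
def find_following(data: bytes, marker: bytes, count: int):
    """Return up to `count` bytes after the first marker, or None if it is absent."""
    pos = data.find(marker)
    if pos == -1:
        # Marker not present: report that explicitly rather than slicing.
        return None
    start = pos + len(marker)
    # The slice may be shorter than `count` if the marker sits near the end.
    return data[start:start + count]


input_data = bytes.fromhex(
    "35 04 65 FF D0 00 10 24 D0 01 10 24 E0 20 10 2C 84 D0 05 10 24 D0 07 10 24"
)
found = find_following(input_data, bytes.fromhex("E0 20"), 3)
missing = find_following(input_data, bytes.fromhex("AA BB"), 3)
print(found.hex() if found is not None else "marker not found")
print(missing)
```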