Home > Net >  Count Zero and/or Non-Zero bytes in a bytea/string
Count Zero and/or Non-Zero bytes in a bytea/string

Time:11-07

I have a field in bytea format, and I'm trying to calculate how many zero bytes are in the field (postgresql).

Example String:

0x202ee0ed0000000000000000000000000000000000000000000000000000000000014370000000000000000000000000000000000000000000000000000000003f8affe7

I originally tried to do this by counting how often 00 occurred, but this can be inaccurate since that doesn't check if it's a byte or two 0s that happen to be next to each other.

I'm currently using this regex_replace (found from another question) to force a _ in between bytes so that I can then count 00 occurrences, but this slows down the query by multiple magnitudes vs a simple replace on the order of 100k-1m rows: regexp_replace(data::text, '(..)', E'\\1_', 'g') - Produces something like 20_2e_e0...

I'm wondering if anyone knows of a more performant way to count the number of zero or non-zero bytes in a bytea/string?

CodePudding user response:

There is no very simple and efficient way in SQL, I think. The best I can come up with is:

SELECT cardinality(
          string_to_array(
             encode('\xDEADF00D0000', 'escape'),
             '\000'
          )
       ) - 1;

 ?column? 
══════════
        2
(1 row)
  • Related