What would be Perl's unpack representation in Java

Time:07-28

I have a Perl script that takes a value and unpacks it into a few binary data attributes, for example:

my ( $atr1, $atr2, $atr3, $atr4 ) = unpack('a3a16a32a*', $original_value);

I would like to know how to achieve the same in Java, perhaps using ByteBuffer or some other means.

In this gist https://gist.github.com/enrobsop/8403717 it is done for integers but I'm still not clear on how to handle binary data.

Answer:

That unpack returns four strings. Specifically, it returns the first 3 characters of the string value of $original_value as one string, the next 16 as another, the next 32 as another, and the rest as the fourth and final one. Put differently, it returns four slices of the string: one from characters 0..2, one from characters 3..18, one from characters 19..50, and one from character 51 to the end.

Note that Perl strings are quite different from Java's. Perl strings are strings of 72-bit characters[1], frequently used to store both text (in the form of Unicode Code Points[2]) and bytes. Java strings, on the other hand, are strings of 16-bit characters used almost exclusively to store text in the form of UTF-16 code units.

There is insufficient context to know whether the string in $original_value is text or bytes (or something else). If it's bytes, the Java equivalent would use arrays of byte values or a ByteBuffer rather than strings, and you'd use four calls to java.util.Arrays's copyOfRange or java.nio.ByteBuffer's get to perform the operation. If it's text, there's no direct equivalent in core Java (according to my limited and ancient knowledge of Java).
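For the bytes case, the four calls described above could look roughly like the following. This is only a minimal sketch: the class name UnpackSketch, its method names, and the WIDTHS array are hypothetical, and it assumes that input shorter than 51 bytes should simply yield shorter (possibly empty) fields rather than an error.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper; the class and method names are illustrative only.
final class UnpackSketch {

    // Fixed field widths taken from the Perl template 'a3a16a32a*'.
    private static final int[] WIDTHS = {3, 16, 32};

    // Variant 1: slice the array with java.util.Arrays.copyOfRange.
    static byte[][] unpackWithCopyOfRange(byte[] data) {
        byte[][] fields = new byte[WIDTHS.length + 1][];
        int offset = 0;
        for (int i = 0; i < WIDTHS.length; i++) {
            // Clamp the end so that short input gives short fields (assumption).
            int end = Math.min(offset + WIDTHS[i], data.length);
            fields[i] = Arrays.copyOfRange(data, offset, end);
            offset = end;
        }
        // 'a*' takes whatever is left.
        fields[WIDTHS.length] = Arrays.copyOfRange(data, offset, data.length);
        return fields;
    }

    // Variant 2: read the fields sequentially with java.nio.ByteBuffer.get.
    static byte[][] unpackWithByteBuffer(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        byte[][] fields = new byte[WIDTHS.length + 1][];
        for (int i = 0; i < WIDTHS.length; i++) {
            fields[i] = new byte[Math.min(WIDTHS[i], buf.remaining())];
            buf.get(fields[i]); // relative bulk get; advances the buffer position
        }
        fields[WIDTHS.length] = new byte[buf.remaining()];
        buf.get(fields[WIDTHS.length]);
        return fields;
    }

    public static void main(String[] args) {
        byte[] original = new byte[60]; // sample input: 60 bytes with values 0..59
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        for (byte[] field : unpackWithCopyOfRange(original)) {
            System.out.println(field.length); // prints 3, 16, 32, 9
        }
    }
}
```

Both variants copy the data. If copying matters, ByteBuffer.slice() on a positioned and limited buffer would give views instead, at the cost of slightly more bookkeeping.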


  1. Well, the internal encoding supports 72-bit characters, but only 32- or 64-bit characters are supported in practice.

  2. Each character is a value in [0x000000,0x10FFFF], which is larger than a Java char can support. For example, length("\N{U+100000}") is 1.
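To illustrate the difference described in the second footnote from the Java side, here is a small sketch. The class CodePointSketch and its helper sliceByCodePoints are hypothetical names. The helper shows one possible way to slice by code point rather than by char using String.offsetByCodePoints; it is an illustration of the idea, not a drop-in unpack replacement, and it throws IndexOutOfBoundsException if the string has fewer code points than requested.

```java
// Hypothetical demo; the class and method names are illustrative only.
final class CodePointSketch {

    // Returns the code points in [from, to), counted in code points, not chars.
    static String sliceByCodePoints(String s, int from, int to) {
        int start = s.offsetByCodePoints(0, from);
        int end = s.offsetByCodePoints(start, to - from);
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        // U+100000 lies outside the Basic Multilingual Plane, so Java stores it
        // as a surrogate pair (two chars).
        String s = new String(Character.toChars(0x100000)) + "abc";

        System.out.println(s.length());                      // 5 (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length())); // 4 (code points)
        System.out.println(sliceByCodePoints(s, 1, 4));      // abc
    }
}
```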
