Strip a base64 img string from an HTML file

Time:11-11

I have an HTML file with a base64-encoded image embedded in it:

</div><img src='data:image/png;base64,{base64-string} class='pic' style='position:absolute;width:100%;height:100%;index:7102;'/></div></div>

I would like to use a bash shell script to extract the base64 image string and save it as a PNG file.

My main question is how to strip the base64 string ({base64-string}) out of the HTML.

I tried xmllint --xpath, but the result also included this unwanted trailing text:

class='pic' style='position:absolute;width:100%;height:100%;index:7102;'/>

CodePudding user response:

Your source HTML is problematic as a minimal, reproducible example: the tags do not open and close properly, and the opening single quote on src=... is never closed, which is likely why the following attributes end up in your result. If the source HTML is altered to:

<div><img src='data:image/png;base64,{base64-string}' class='pic' style='position:absolute;width:100%;height:100%;index:7102;'/></div>

You can extract the base64 string using xmllint and bash parameter expansion:

$ var=$(echo "<div><img src='data:image/png;base64,{base64-string}' class='pic' style='position:absolute;width:100%;height:100%;index:7102;'/></div>" | xmllint --xpath "string(//div/img/@src)" -); echo "${var##*,}"
{base64-string}
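
To finish the task from the question (saving the image to a file), the stripped string can be piped into base64 -d. The following is only a minimal sketch under some assumptions: the variable names and output path are made up for illustration, the payload is a stand-in (just the 8-byte PNG signature, not a real image), src is set by hand to the kind of value the xmllint call above returns, and base64 -d is the GNU coreutils spelling of the decode flag (some older macOS versions use -D instead).

```shell
#!/usr/bin/env bash
# Stand-in payload (assumption): the 8-byte PNG signature, base64-encoded.
# In real use this would be the full {base64-string} from the HTML.
payload=$(printf '\211PNG\r\n\032\n' | base64)

# The kind of value that string(//div/img/@src) yields for the corrected HTML.
src="data:image/png;base64,${payload}"

# Drop everything up to and including the last comma,
# leaving only the base64 data.
b64="${src##*,}"

# Decode the base64 text and write the raw bytes to a file.
# The output path is hypothetical.
outfile="${TMPDIR:-/tmp}/image.png"
printf '%s' "$b64" | base64 -d > "$outfile"
```

With the real xmllint output in src, the same last two steps should produce the actual PNG. The ##*, expansion is safe here because the base64 alphabet never contains a comma.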