extract hash digits from string

Time:10-05

How can I extract only the hash digits inside the quotation marks from strings such as:

[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']

[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']

[file:hashes.'SHA256' = 'ff3b45ecfbbdb780b48b4c829d2b6078d8f7673d823bedbd6321699770fa3f84']

I need to extract the digits and insert into a table using this script:

if item['hash'][:12] == '[file:hashes':    # matches hash strings from the JSON dict, like the ones listed above
    if item['hash'][22:-2] not in hash_column:    # extracts the digits, but the offset 22 only fits MD5
        insert_hash_table(item['hash'][22:-2])    # insert the hash digits

So in the above example, if the part of the string before the '=' changes because the hash type is different, the fixed slice in my code no longer works. Is there any way to extract only the digits after the '=' inside the quotation marks for all types of hashes? E.g. 82d29b52e35e7938e7ee610c04ea9daaf5e08e90

CodePudding user response:

Try (regex101):

import re

s = """\
[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']
[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']
[file:hashes.'SHA256' = 'ff3b45ecfbbdb780b48b4c829d2b6078d8f7673d823bedbd6321699770fa3f84']"""

pat = re.compile(r"=\s*'([^']+)'")

for m in pat.findall(s):
    print(m)

Prints:

547334e75ed7d4eea2953675b07986b4
82d29b52e35e7938e7ee610c04ea9daaf5e08e90
ff3b45ecfbbdb780b48b4c829d2b6078d8f7673d823bedbd6321699770fa3f84
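
To plug the same pattern into the loop from the question, a minimal sketch could look like the one below. The `items` list, the `hash_column` set and the `insert_hash_table` stub are assumptions standing in for the asker's real JSON data and database code, and `extract_hash` is a hypothetical helper name.

```python
import re

# Capture whatever sits between the single quotes that follow '='
HASH_VALUE = re.compile(r"=\s*'([^']+)'")

def extract_hash(text):
    """Return the quoted value after '=', or None if the pattern is absent."""
    m = HASH_VALUE.search(text)
    return m.group(1) if m else None

# Hypothetical stand-ins for the asker's table column and insert function
hash_column = set()

def insert_hash_table(value):
    hash_column.add(value)

# Hypothetical sample of the JSON items described in the question
items = [
    {"hash": "[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']"},
    {"hash": "[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']"},
    {"hash": "[url:value = 'http://example.com']"},  # not a file hash, skipped
]

for item in items:
    if item["hash"].startswith("[file:hashes"):
        digest = extract_hash(item["hash"])
        if digest is not None and digest not in hash_column:
            insert_hash_table(digest)
```

Because the pattern anchors on the '=' and the quotes instead of a fixed offset, the same code handles MD5, SHA1 and SHA256 without changes.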

CodePudding user response:

You could split a string using '=' as a delimiter. Something like this:

hash_str = "[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']"

# Take the part after '=', then strip the spaces, quotes and closing bracket around it
result = hash_str.split('=')[1].strip(" ]'")
print(result)

This code gives me the result:

547334e75ed7d4eea2953675b07986b4
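
One caveat with indexing the result of split is that it raises an IndexError when the string contains no '='. A hedged variant using `str.partition` avoids that; the helper name `value_after_equals` is made up for this sketch, and it assumes the value itself never starts or ends with a space, quote or bracket.

```python
def value_after_equals(text):
    """Return the text after the first '=', minus surrounding spaces, quotes and brackets."""
    _, sep, tail = text.partition("=")
    if not sep:  # no '=' in the string, so there is nothing to extract
        return None
    return tail.strip(" []'")

print(value_after_equals("[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']"))
```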