How can I extract only the hash value inside the quotes from strings such as:
[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']
[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']
[file:hashes.'SHA256' = 'ff3b45ecfbbdb780b48b4c829d2b6078d8f7673d823bedbd6321699770fa3f84']
I need to extract the hash values and insert them into a table using this script:
if item['hash'][:12] == '[file:hashes':  # finds the hash strings (like those above) in the JSON dict
    if item['hash'][22:-2] not in hash_column:  # extracts the value, but the offsets only work for MD5
        insert_hash_table(item['hash'][22:-2])  # insert the hash value
In the example above, the text before the '=' changes with the hash type, so the fixed slice offsets in my code stop working. Is there a way to extract only the value inside the quotes after the '=' for every type of hash? For example: 82d29b52e35e7938e7ee610c04ea9daaf5e08e90
CodePudding user response:
Try (regex101):
import re
s = """\
[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']
[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']
[file:hashes.'SHA256' = 'ff3b45ecfbbdb780b48b4c829d2b6078d8f7673d823bedbd6321699770fa3f84']"""
# match '=', optional whitespace, then capture everything between the quotes
pat = re.compile(r"=\s*'([^']+)'")
for m in pat.findall(s):
    print(m)
Prints:
547334e75ed7d4eea2953675b07986b4
82d29b52e35e7938e7ee610c04ea9daaf5e08e90
ff3b45ecfbbdb780b48b4c829d2b6078d8f7673d823bedbd6321699770fa3f84
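To plug this into the loop from the question, one option is to wrap the pattern in a small helper that returns None when nothing matches. This is only a sketch: `extract_hash`, the sample `items` list, and the use of a plain list for `hash_column` are assumptions made for illustration, and the `append` call stands in for the question's `insert_hash_table`.

```python
import re

# Capture whatever sits between the quotes that follow '='.
HASH_VALUE = re.compile(r"=\s*'([^']+)'")

def extract_hash(pattern_string):
    """Return the quoted value after '=', or None if there is no match."""
    match = HASH_VALUE.search(pattern_string)
    return match.group(1) if match else None

# Hypothetical stand-ins for the question's JSON items and hash_column.
items = [
    {"hash": "[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']"},
    {"hash": "[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']"},
    {"hash": "[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']"},
    {"hash": "[url:value = 'http://example.com']"},
]
hash_column = []

for item in items:
    if item["hash"].startswith("[file:hashes"):
        value = extract_hash(item["hash"])
        if value is not None and value not in hash_column:
            # The question would call insert_hash_table(value) here.
            hash_column.append(value)

print(hash_column)
```

Because the pattern does not depend on the length of the text before the '=', the same code should handle MD5, SHA1, and SHA256 entries; the duplicate SHA1 entry and the non-hash item are skipped.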
CodePudding user response:
You could split a string using '=' as a delimiter. Something like this:
# named hash_str to avoid shadowing the built-in hash()
hash_str = "file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4'"
result = hash_str.split('=')[1].strip().strip("'")
print(result)
This code gives me the result:
547334e75ed7d4eea2953675b07986b4
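Note that the sample string above has no surrounding square brackets, while the strings in the question do; with a trailing `]` the final `strip("'")` would leave `']` on the end. A possible variation, assuming the value itself never starts or ends with a space, quote, or bracket (true for hex digests), is to strip all three characters at once. The helper name `hash_after_equals` is made up for this sketch.

```python
def hash_after_equals(pattern_string):
    """Return the text after the first '=', minus spaces, quotes and ']'."""
    # partition() does not raise when '=' is missing; right is then ''.
    _, _, right = pattern_string.partition("=")
    return right.strip(" ']")

samples = [
    "[file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4']",
    "[file:hashes.'SHA1' = '82d29b52e35e7938e7ee610c04ea9daaf5e08e90']",
    "file:hashes.'MD5' = '547334e75ed7d4eea2953675b07986b4'",
]
for sample in samples:
    print(hash_after_equals(sample))
```

This works with or without the brackets, and returns an empty string when the input contains no '='.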