I am trying to capture the output of a tool and turn it into a list. Using a regex I managed to extract what looks like a proper list, but it is actually a single string.
I have tried using splitlines() and split() to transform it into a proper list, but I can't seem to get it to work.
This is (a part of) the string that needs to be converted to a list:
www.nu.nl
2017.nu.nl
account.nu.nl
accounts.nu.nl
actie.nu.nl
admin.nu.nl
admin-2.nu.nl
admin-2-public.nu.nl
adverteren.nu.nl
www.adverteren.nu.nl
privacy.adverteren.nu.nl
advertorial.nu.nl
api.nu.nl
api2.nu.nl
autodiscover.nu.nl
beta.nu.nl
privacy.beta.nu.nl
brandedcontent.nu.nl
cdn.nu.nl
cmp.nu.nl
editorialinsights.nu.nl
f1.nu.nl
f1spel.nu.nl
facebook.nu.nl
foto.nu.nl
When I use split() or splitlines(), I get the following output:
['\x1b[0m', '\x1b[92mwww.nu.nl\x1b[0m', '\x1b[92m2017.nu.nl\x1b[0m', '\x1b[92maccount.nu.nl\x1b[0m', '\x1b[92maccounts.nu.nl\x1b[0m', '\x1b[92mactie.nu.nl\x1b[0m', '\x1b[92madmin.nu.nl\x1b[0m', '\x1b[92madmin-2.nu.nl\x1b[0m', '\x1b[92madmin-2-public.nu.nl\x1b[0m', '\x1b[92madverteren.nu.nl\x1b[0m', '\x1b[92mwww.adverteren.nu.nl\x1b[0m', '\x1b[92mprivacy.adverteren.nu.nl\x1b[0m', '\x1b[92madvertorial.nu.nl\x1b[0m', '\x1b[92mapi.nu.nl\x1b[0m', '\x1b[92mapi2.nu.nl\x1b[0m', '\x1b[92mautodiscover.nu.nl\x1b[0m', '\x1b[92mbeta.nu.nl\x1b[0m', '\x1b[92mprivacy.beta.nu.nl\x1b[0m', '\x1b[92mbrandedcontent.nu.nl\x1b[0m', '\x1b[92mcdn.nu.nl\x1b[0m', '\x1b[92mcmp.nu.nl\x1b[0m', '\x1b[92meditorialinsights.nu.nl\x1b[0m', '\x1b[92mf1.nu.nl\x1b[0m', '\x1b[92mf1spel.nu.nl\x1b[0m', '\x1b[92mfacebook.nu.nl\x1b[0m', '\x1b[92mfoto.nu.nl\x1b[0m', '\x1b[92mgadgets.nu.nl\x1b[0m', '\x1b[92mgraph.nu.nl\x1b[0m', '\x1b[92mi.nu.nl\x1b[0m', '\x1b[92miphone.nu.nl\x1b[0m', '\x1b[92mlink.nu.nl\x1b[0m', '\x1b[92mlive.nu.nl\x1b[0m', '\x1b[92mlogin.nu.nl\x1b[0m', '\x1b[92mlogin2.nu.nl\x1b[0m', '\x1b[92mm.nu.nl\x1b[0m', '\x1b[92mprivacy.m.nu.nl\x1b[0m', '\x1b[92mmedia.nu.nl\x1b[0m', '\x1b[92mmedia-staging.nu.nl\x1b[0m', '\x1b[92mmediatoolui.nu.nl\x1b[0m', '\x1b[92mmeedoen.nu.nl\x1b[0m', '\x1b[92mmessagent.nu.nl\x1b[0m', '\x1b[92mmetrics.nu.nl\x1b[0m', '\x1b[92mmijnomgeving.nu.nl\x1b[0m', '\x1b[92mmijnomgeving-acc.nu.nl\x1b[0m', '\x1b[92mmijnteam.nu.nl\x1b[0m', '\x1b[92mprivacy.mijnteam.nu.nl\x1b[0m', '\x1b[92mmobi.nu.nl\x1b[0m', '\x1b[92mmobiel.nu.nl\x1b[0m', '\x1b[92mprivacy.mobiel.nu.nl\x1b[0m', '\x1b[92mmsoid.nu.nl\x1b[0m', '\x1b[92mnewsquiz.nu.nl\x1b[0m', '\x1b[92mwww.nu.nu.nl\x1b[0m', '\x1b[92mnumobileapp.nu.nl\x1b[0m', '\x1b[92mold.nu.nl\x1b[0m', '\x1b[92mop.nu.nl\x1b[0m', '\x1b[92morange.nu.nl\x1b[0m', '\x1b[92mpreview.nu.nl\x1b[0m', '\x1b[92mprivacy.nu.nl\x1b[0m', '\x1b[92msecure.nu.nl\x1b[0m', '\x1b[92msentry.nu.nl\x1b[0m', '\x1b[92mservice.nu.nl\x1b[0m', '\x1b[92mshop.nu.nl\x1b[0m', 
'\x1b[92mwww.shop.nu.nl\x1b[0m', '\x1b[92msimonly-advertorial.nu.nl\x1b[0m', '\x1b[92mprivacy.simonly-advertorial.nu.nl\x1b[0m', '\x1b[92msip.nu.nl\x1b[0m', '\x1b[92mspecials.nu.nl\x1b[0m', '\x1b[92mstaging.nu.nl\x1b[0m', '\x1b[92mwww.staging.nu.nl\x1b[0m', '\x1b[92mapi.staging.nu.nl\x1b[0m', '\x1b[92mtalk-cdn.staging.nu.nl\x1b[0m', '\x1b[92mstaging-shop.nu.nl\x1b[0m', '\x1b[92mstatic.nu.nl\x1b[0m', '\x1b[92mstories.nu.nl\x1b[0m', '\x1b[92mtalk.nu.nl\x1b[0m', '\x1b[92mtalk-cdn.nu.nl\x1b[0m', '\x1b[92mtest.nu.nl\x1b[0m', '\x1b[92mwww.test.nu.nl\x1b[0m', '\x1b[92mapi.test.nu.nl\x1b[0m', '\x1b[92mapi-cms2test.test.nu.nl\x1b[0m', '\x1b[92mtalk-cdn.test.nu.nl\x1b[0m', '\x1b[92mtalk2022-cdn.test.nu.nl\x1b[0m', '\x1b[92mwww-cms2test.test.nu.nl\x1b[0m', '\x1b[92mwww1.test.nu.nl\x1b[0m', '\x1b[92mtest-shop.nu.nl\x1b[0m', '\x1b[92mtest-voordeel.nu.nl\x1b[0m', '\x1b[92mtools.nu.nl\x1b[0m', '\x1b[92mtourtopper.nu.nl\x1b[0m', '\x1b[92murl8180.nu.nl\x1b[0m', '\x1b[92mverkiezingen.nu.nl\x1b[0m', '\x1b[92mvoordeel.nu.nl\x1b[0m', '\x1b[92mwidgets.nu.nl\x1b[0m', '\x1b[92macceptatie.widgets.nu.nl\x1b[0m', '\x1b[92mwintickets.nu.nl\x1b[0m', '\x1b[92mprivacy.wintickets.nu.nl\x1b[0m', '\x1b[92mprivacy.www.nu.nl\x1b[0m', '\x1b[92mwww1.nu.nl\x1b[0m', '\x1b[92mzon.nu.nl\x1b[0m', '\x1b[92mbrandedcontent.oudersvannu.nl\x1b[0m', '\x1b[92mmedia.oudersvannu.nl\x1b[0m']
I figured they were ASCII characters and tried to filter them out using .encode("ascii", "ignore") followed by .decode(), but that makes no difference.
My code:
import re
import subprocess
# (?s) goes first: Python 3.11+ rejects global inline flags placed mid-pattern.
pattern = r'(?s)(?<=Total Unique Subdomains Found: ..)(.*$)'
result = subprocess.run(['python3', '/opt/sublist3r/sublist3r.py', '-d', self.domain], stdout=subprocess.PIPE).stdout.decode('utf-8')
regexOutput = re.findall(pattern, result)
print(regexOutput[0])
This gives me the list that is at the beginning of this post.
Could anyone help me with what to do?
CodePudding user response:
Use a regular expression to delete them:
import re
ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')
a = ['\x1b[0m', '\x1b[92mwww.nu.nl\x1b[0m', '\x1b[92m2017.nu.nl\x1b[0m', '\x1b[92maccount.nu.nl\x1b[0m', '\x1b[92maccounts.nu.nl\x1b[0m', '\x1b[92mactie.nu.nl\x1b[0m', '\x1b[92madmin.nu.nl\x1b[0m', '\x1b[92madmin-2.nu.nl\x1b[0m', '\x1b[92madmin-2-public.nu.nl\x1b[0m', '\x1b[92madverteren.nu.nl\x1b[0m', '\x1b[92mwww.adverteren.nu.nl\x1b[0m', '\x1b[92mprivacy.adverteren.nu.nl\x1b[0m', '\x1b[92madvertorial.nu.nl\x1b[0m', '\x1b[92mapi.nu.nl\x1b[0m', '\x1b[92mapi2.nu.nl\x1b[0m', '\x1b[92mautodiscover.nu.nl\x1b[0m', '\x1b[92mbeta.nu.nl\x1b[0m', '\x1b[92mprivacy.beta.nu.nl\x1b[0m', '\x1b[92mbrandedcontent.nu.nl\x1b[0m', '\x1b[92mcdn.nu.nl\x1b[0m', '\x1b[92mcmp.nu.nl\x1b[0m', '\x1b[92meditorialinsights.nu.nl\x1b[0m', '\x1b[92mf1.nu.nl\x1b[0m', '\x1b[92mf1spel.nu.nl\x1b[0m', '\x1b[92mfacebook.nu.nl\x1b[0m', '\x1b[92mfoto.nu.nl\x1b[0m', '\x1b[92mgadgets.nu.nl\x1b[0m', '\x1b[92mgraph.nu.nl\x1b[0m', '\x1b[92mi.nu.nl\x1b[0m', '\x1b[92miphone.nu.nl\x1b[0m', '\x1b[92mlink.nu.nl\x1b[0m', '\x1b[92mlive.nu.nl\x1b[0m', '\x1b[92mlogin.nu.nl\x1b[0m', '\x1b[92mlogin2.nu.nl\x1b[0m', '\x1b[92mm.nu.nl\x1b[0m', '\x1b[92mprivacy.m.nu.nl\x1b[0m', '\x1b[92mmedia.nu.nl\x1b[0m', '\x1b[92mmedia-staging.nu.nl\x1b[0m', '\x1b[92mmediatoolui.nu.nl\x1b[0m', '\x1b[92mmeedoen.nu.nl\x1b[0m', '\x1b[92mmessagent.nu.nl\x1b[0m', '\x1b[92mmetrics.nu.nl\x1b[0m', '\x1b[92mmijnomgeving.nu.nl\x1b[0m', '\x1b[92mmijnomgeving-acc.nu.nl\x1b[0m', '\x1b[92mmijnteam.nu.nl\x1b[0m', '\x1b[92mprivacy.mijnteam.nu.nl\x1b[0m', '\x1b[92mmobi.nu.nl\x1b[0m', '\x1b[92mmobiel.nu.nl\x1b[0m', '\x1b[92mprivacy.mobiel.nu.nl\x1b[0m', '\x1b[92mmsoid.nu.nl\x1b[0m', '\x1b[92mnewsquiz.nu.nl\x1b[0m', '\x1b[92mwww.nu.nu.nl\x1b[0m', '\x1b[92mnumobileapp.nu.nl\x1b[0m', '\x1b[92mold.nu.nl\x1b[0m', '\x1b[92mop.nu.nl\x1b[0m', '\x1b[92morange.nu.nl\x1b[0m', '\x1b[92mpreview.nu.nl\x1b[0m', '\x1b[92mprivacy.nu.nl\x1b[0m', '\x1b[92msecure.nu.nl\x1b[0m', '\x1b[92msentry.nu.nl\x1b[0m', '\x1b[92mservice.nu.nl\x1b[0m', '\x1b[92mshop.nu.nl\x1b[0m', 
'\x1b[92mwww.shop.nu.nl\x1b[0m', '\x1b[92msimonly-advertorial.nu.nl\x1b[0m', '\x1b[92mprivacy.simonly-advertorial.nu.nl\x1b[0m', '\x1b[92msip.nu.nl\x1b[0m', '\x1b[92mspecials.nu.nl\x1b[0m', '\x1b[92mstaging.nu.nl\x1b[0m', '\x1b[92mwww.staging.nu.nl\x1b[0m', '\x1b[92mapi.staging.nu.nl\x1b[0m', '\x1b[92mtalk-cdn.staging.nu.nl\x1b[0m', '\x1b[92mstaging-shop.nu.nl\x1b[0m', '\x1b[92mstatic.nu.nl\x1b[0m', '\x1b[92mstories.nu.nl\x1b[0m', '\x1b[92mtalk.nu.nl\x1b[0m', '\x1b[92mtalk-cdn.nu.nl\x1b[0m', '\x1b[92mtest.nu.nl\x1b[0m', '\x1b[92mwww.test.nu.nl\x1b[0m', '\x1b[92mapi.test.nu.nl\x1b[0m', '\x1b[92mapi-cms2test.test.nu.nl\x1b[0m', '\x1b[92mtalk-cdn.test.nu.nl\x1b[0m', '\x1b[92mtalk2022-cdn.test.nu.nl\x1b[0m', '\x1b[92mwww-cms2test.test.nu.nl\x1b[0m', '\x1b[92mwww1.test.nu.nl\x1b[0m', '\x1b[92mtest-shop.nu.nl\x1b[0m', '\x1b[92mtest-voordeel.nu.nl\x1b[0m', '\x1b[92mtools.nu.nl\x1b[0m', '\x1b[92mtourtopper.nu.nl\x1b[0m', '\x1b[92murl8180.nu.nl\x1b[0m', '\x1b[92mverkiezingen.nu.nl\x1b[0m', '\x1b[92mvoordeel.nu.nl\x1b[0m', '\x1b[92mwidgets.nu.nl\x1b[0m', '\x1b[92macceptatie.widgets.nu.nl\x1b[0m', '\x1b[92mwintickets.nu.nl\x1b[0m', '\x1b[92mprivacy.wintickets.nu.nl\x1b[0m', '\x1b[92mprivacy.www.nu.nl\x1b[0m', '\x1b[92mwww1.nu.nl\x1b[0m', '\x1b[92mzon.nu.nl\x1b[0m', '\x1b[92mbrandedcontent.oudersvannu.nl\x1b[0m', '\x1b[92mmedia.oudersvannu.nl\x1b[0m']
print([ansi_escape.sub('', i) for i in a])
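A variation on the same idea, as a minimal sketch: strip the escape sequences from the whole string first and only then call splitlines(), so the list comes out clean in one pass. The string raw is a shortened, made-up stand-in for the tool output, and the helper name to_clean_list is hypothetical, not part of Sublist3r.

```python
import re

# Same pattern as above: matches ANSI escape sequences such as '\x1b[92m'.
ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')


def to_clean_list(text):
    """Remove ANSI escape sequences, then split into non-empty lines."""
    stripped = ansi_escape.sub('', text)
    return [line.strip() for line in stripped.splitlines() if line.strip()]


# Hypothetical, shortened stand-in for the captured tool output.
raw = '\x1b[0m\n\x1b[92mwww.nu.nl\x1b[0m\n\x1b[92m2017.nu.nl\x1b[0m\n'
print(to_clean_list(raw))
```

In the question's code this would presumably be called as to_clean_list(regexOutput[0]). Dropping empty lines also gets rid of the leading '\x1b[0m' entry, which becomes an empty string once the escape sequence is removed.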
CodePudding user response:
These are ANSI escape sequences, used to print in color in a terminal, e.g. in bash. \x1b[92m means light green and \x1b[0m means turn everything off, i.e. stop writing in green.
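This is also why .encode("ascii", "ignore") made no difference: every character in these sequences, including the escape character \x1b itself, is plain ASCII, so there is nothing for "ignore" to drop. A small sketch with a made-up sample string:

```python
# Hypothetical sample: a single coloured hostname as the tool would emit it.
sample = '\x1b[92mwww.nu.nl\x1b[0m'

# ESC is code point 27, well inside the ASCII range (0-127).
print(ord('\x1b'))

# str.isascii() requires Python 3.7 or newer.
print(sample.isascii())

# Round-tripping through ASCII therefore leaves the string unchanged.
round_tripped = sample.encode('ascii', 'ignore').decode()
print(round_tripped == sample)
```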