I'm trying to use multiprocessing to speed things up. The goal: each process should run queries against the domains listed in a text file. On execution, though, the processes all do the same work: every one of them starts querying from the first line instead of each process taking its own lines. So the main target is for each process to query a domain from a different line of the source .txt file.
Here's the code I'm using:
import os
import fnmatch
import requests
from collections import defaultdict
from multiprocessing import Process, Event, cpu_count
from time import sleep

class diginfo:
    expected_response = 101
    control_domain = 'd2f99r5bkcyeqq.cloudfront.net'
    payloads = {
        "Host": control_domain,
        "Upgrade": "websocket",
        "DNT": "1",
        "Accept-Language": "*",
        "Accept": "*/*",
        "Accept-Encoding": "*",
        "Connection": "keep-alive, upgrade",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36",
    }
    file_hosts = ""
    result_success = []
    num_file = 1
    columns = defaultdict(list)
    txtfiles = []
    hostpath = 'host'

def engines(counts, terminate, reach):
    for domain in domainlist:
        try:
            r = requests.get("http://" + domain, headers=headers,
                             timeout=0.7, allow_redirects=False)
            if r.status_code == diginfo.expected_response:
                print("Success " + domain)
                print(domain, file=open("RelateCFront.txt", "a"))
                diginfo.result_success.append(str(domain))
            elif r.status_code != diginfo.expected_response:
                print("Failed " + domain + " " + str(r.status_code))
        except requests.RequestException:
            pass  # skip domains that time out or refuse the connection
    print(" Loaded : " + str(len(diginfo.result_success)))
    if len(diginfo.result_success) >= 0:
        print(" Successful Result : ")
        for result in diginfo.result_success:
            print("  " + result)
        print("")
    while not terminate.is_set():
        reach.set()
        break

def fromtext():
    global headers, domainlist
    headers = diginfo.payloads  # the request headers used by engines()
    files = os.listdir(diginfo.hostpath)
    for f in files:
        if fnmatch.fnmatch(f, '*.txt'):
            print(str(diginfo.num_file), str(f))
            diginfo.num_file += 1
            diginfo.txtfiles.append(str(f))
    fileselector = input("Choose Target Files : ")
    print("Target Chosen : " + diginfo.txtfiles[int(fileselector) - 1])
    file_hosts = str(diginfo.hostpath) + "/" + str(diginfo.txtfiles[int(fileselector) - 1])
    with open(file_hosts) as f:
        parseddom = f.read().split()
    domainlist = list(filter(None, set(parseddom)))  # dedupe and drop empty entries
    terminate = Event()
    reach = Event()
    for counts in range(cpu_count()):
        p = Process(target=engines, args=(counts, terminate, reach))
        p.start()
    reach.wait()
    terminate.set()
    sleep(3)
    exit()

fromtext()
Here's what I have done:
for domain in domainlist:
    p = Process(target=engines, args=(domainlist, terminate, reach))
    p.start()
It seems unresponsive: it produces 0 results and spawns an endless number of processes. I can't pass the counts argument, since engines only accepts 3 arguments. terminate and reach are used to signal once the requirements have been reached.
CodePudding user response:
You need to split domainlist up into cpu_count() sections, and pass each section to a different process. You're also using the events incorrectly: currently, the script exits 3 seconds after any one process finishes, regardless of whether the others are still working. You should use a Barrier instead, or just call join() on each process in fromtext().
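If you go the Barrier route, a minimal sketch might look like this (the barrier size and the way it's passed to the workers are my assumptions, not part of your original code):

from multiprocessing import Barrier, Process, cpu_count

def engines(domainsublist, barrier):
    for domain in domainsublist:
        ...  # issue the requests as before
    barrier.wait()  # each worker checks in once its share is done

def fromtext():
    ...  # build domainlist as before
    num_cpus = cpu_count()
    barrier = Barrier(num_cpus + 1)  # every worker, plus the main process
    for process_num in range(num_cpus):
        section = domainlist[process_num::num_cpus]
        Process(target=engines, args=(section, barrier)).start()
    barrier.wait()  # blocks until all workers have also called wait()

The simpler route is just to join() each process: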
def engines(domainsublist):
    for domain in domainsublist:
        ...

def fromtext():
    ...
    num_cpus = cpu_count()
    processes = []
    for process_num in range(num_cpus):
        section = domainlist[process_num::num_cpus]
        p = Process(target=engines, args=(section,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
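The [process_num::num_cpus] slice stripes the list across the workers, so every domain is handled exactly once. A quick illustration with made-up domains:

domainlist = ["a.com", "b.com", "c.com", "d.com", "e.com"]
num_cpus = 2
for process_num in range(num_cpus):
    print(domainlist[process_num::num_cpus])
# ['a.com', 'c.com', 'e.com']
# ['b.com', 'd.com']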
Finally, you've got some race conditions in engines(): when you write to RelateCFront.txt, and when you append to diginfo.result_success. There are plenty of good solutions for these on SO; I won't try to fix them here.
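For reference, one common pattern (a sketch of one option only, reusing the headers and diginfo globals from the question) is to have each worker push its successes onto a multiprocessing.Queue and let the parent do all the writing, so there is only ever one writer:

from multiprocessing import Process, Queue, cpu_count

def engines(domainsublist, results):
    for domain in domainsublist:
        try:
            r = requests.get("http://" + domain, headers=headers,
                             timeout=0.7, allow_redirects=False)
            if r.status_code == diginfo.expected_response:
                results.put(domain)  # hand the result to the parent
        except requests.RequestException:
            pass
    results.put(None)  # sentinel: this worker is finished

def fromtext():
    ...  # build domainlist as before
    num_cpus = cpu_count()
    results = Queue()
    processes = []
    for process_num in range(num_cpus):
        section = domainlist[process_num::num_cpus]
        p = Process(target=engines, args=(section, results))
        p.start()
        processes.append(p)
    finished = 0
    with open("RelateCFront.txt", "a") as f:
        while finished < num_cpus:  # drain until every worker has sent its sentinel
            domain = results.get()
            if domain is None:
                finished += 1
            else:
                diginfo.result_success.append(domain)
                print(domain, file=f)
    for p in processes:
        p.join()

Draining the queue before joining also avoids the documented deadlock where a child blocks on a full queue while the parent blocks in join().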