I have a simple executable downloaded from the Apple App Store. It gives real-time updates on crypto prices and their percentage changes, and I am extracting the percentage changes for Bitcoin.
I am using a subprocess to capture the output. I store the output in four separate text files, then read the text from the files, extract the data I need, and save it into a pandas DataFrame.
Additionally, I have a 60-second timeout for each run of the executable, and I read each file after 70 seconds. After reading, I truncate the file to remove its content; by the time the next file has output in it, I read that file as well, then truncate it, and so on.
I want to know how to split the work between saving the output into a text file and reading the content and then truncating it. For example, I am starting one process that should run the executable (append_output_executable) and another that should extract the output (write_truncate_output). However, only append_output_executable ever runs.
Here's my script:
from subprocess import STDOUT, check_call as x
import os
from multiprocessing import Pool
import time
import re
from collections import defaultdict
import pandas as pd
from multiprocessing import Process
cmd = [r'/Applications/CryptoManiac.app/Contents/MacOS/CryptoManiac']
text_file = ['bitcoin1.txt','bitcoin2.txt','bitcoin3.txt','bitcoin4.txt']
def append_output_executable(cmd):
    while True:
        i = '1234'
        for num in i:
            try:  # append the .exe output to multiple files
                with open(os.devnull, 'rb') as DEVNULL, open('bitcoin{}.txt'.format(num), 'ab') as f:
                    x(cmd, stdout=f, stderr=STDOUT, timeout=60)
            except:
                pass
def write_truncate_output(text):
    while True:
        time.sleep(70)
        with open(text, 'r+') as f:
            data = f.read()
            f.truncate(0)
            # read and truncate after reading the data
        # filter and format
        percentage = re.findall(r'\bpercent_change_24h:\s.*', data)
        value = [x.split(':')[1] for x in percentage]
        key = [x.split(':')[0] for x in percentage]
        # store in dictionary
        percent_dict = defaultdict(list)
        for ke, val in zip(key, value):
            percent_dict[ke].append(val)
        percent_dict['file'].append(text)
        percent_frame = pd.DataFrame(percent_dict)
        print(percent_frame)
if __name__ == '__main__':
    for text in text_file:
        execute_process = Process(target=append_output_executable, args=(cmd,))
        output_process = Process(target=write_truncate_output, args=(text,))
        execute_process.start()
        execute_process.join()
        output_process.start()
        output_process.join()
This still just runs the first function.
CodePudding user response:
I don't know if this answer resolves all your problems, because there are a few mistakes in the program - when you fix one mistake, it still doesn't work because there are others.
First: target in Thread and Process needs the function's name without () and the arguments passed separately (this is called a callback) - later, when you use .start(), it will add the () to run the function inside the new Thread or Process:
Thread(target=append_output_executable, args=(cmd,))
Process(target=append_output_executable, args=(cmd,))
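For example, a minimal sketch of the difference (the worker function here is hypothetical, not from the question):
from threading import Thread
import time

def worker(name):
    time.sleep(1)          # pretend to do some work
    print('done:', name)

# wrong: worker('a') runs at once in the main thread and its
# result (None) is passed as the target
# t = Thread(target=worker('a'))

# right: pass the callback and its arguments separately
t = Thread(target=worker, args=('a',))
t.start()
t.join()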
Second: Thread and Process run the function only once, so it needs a while loop to keep running all the time. It also can't use return, because return ends the function.
Third: .join() blocks the code because it waits for the Thread or Process to end, so it should be called only after starting all threads/processes - usually it is used at the end of the program, when you want to wait for all threads/processes to finish.
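This is exactly what happens in the question's __main__ block. A minimal sketch of the difference, reusing the question's names (and assuming the worker functions eventually return):
# blocks: each .join() waits before the next process is even created
for text in text_file:
    p = Process(target=write_truncate_output, args=(text,))
    p.start()
    p.join()               # nothing else runs until this one finishes

# parallel: start everything first, join only at the end
processes = []
for text in text_file:
    p = Process(target=write_truncate_output, args=(text,))
    p.start()
    processes.append(p)

for p in processes:
    p.join()               # now wait for all of them together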
And a small suggestion: you could use a global running = True and a while running loop inside the functions - later you can set running = False to stop the loops (and let the functions finish).
The code could look like this:
# ... other imports ...
from threading import Thread

def append_output_executable(cmd):
    while running:
        # ... code ... (without `return`)

def write_truncate_output(text):
    while running:
        # ... code ... (without `return`)

# --- main ---

# global variables

running = True

if __name__ == '__main__':

    # --- create and start ---

    t0 = Thread(target=append_output_executable, args=(cmd,))
    t0.start()

    other_threads = []

    for text in text_file:
        t = Thread(target=write_truncate_output, args=(text,))
        t.start()
        other_threads.append(t)

    # ... other code ...

    # --- at the end of program ---

    running = False

    # --- wait for end of functions ---

    t0.join()

    for t in other_threads:
        t.join()
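A side note, not part of the original answer: because all threads share the same memory, the global running flag works here; threading.Event does the same job with an explicit API. A minimal sketch (stop_event and the placeholder cmd are names chosen for illustration):
from threading import Thread, Event
import time

stop_event = Event()        # replaces the global running flag

def append_output_executable(cmd):
    while not stop_event.is_set():
        # ... run the executable and append its output, as above ...
        time.sleep(1)

if __name__ == '__main__':
    cmd = ['echo', 'tick']  # placeholder command for this sketch
    t0 = Thread(target=append_output_executable, args=(cmd,))
    t0.start()
    time.sleep(5)           # ... other code ...
    stop_event.set()        # ask the loop to finish
    t0.join()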
The Process version of the answer is exactly the same (I keep the same variable names to show that everything else is unchanged):
# ... other imports ...
from multiprocessing import Process

def append_output_executable(cmd):
    while running:
        # ... code ... (without `return`)

def write_truncate_output(text):
    while running:
        # ... code ... (without `return`)

# --- main ---

# global variables

running = True

if __name__ == '__main__':

    # --- create and start ---

    t0 = Process(target=append_output_executable, args=(cmd,))
    t0.start()

    other_threads = []

    for text in text_file:
        t = Process(target=write_truncate_output, args=(text,))
        t.start()
        other_threads.append(t)

    # ... other code ...

    # --- at the end of program ---

    running = False

    # --- wait for end of functions ---

    t0.join()

    for t in other_threads:
        t.join()
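One caveat worth adding to the Process version, which the original answer does not mention: each child process gets its own copy of the module's globals, so setting running = False in the main process will not be seen by loops already running in the children (and the joins would then wait forever). A shared primitive such as multiprocessing.Event does work across processes; a minimal sketch under that assumption (the placeholder cmd is for illustration only):
from multiprocessing import Process, Event
import time

def append_output_executable(cmd, stop_event):
    while not stop_event.is_set():
        # ... run the executable and append its output, as above ...
        time.sleep(1)

if __name__ == '__main__':
    cmd = ['echo', 'tick']      # placeholder command for this sketch
    stop_event = Event()        # shared between parent and child processes
    t0 = Process(target=append_output_executable, args=(cmd, stop_event))
    t0.start()
    time.sleep(5)               # ... other code ...
    stop_event.set()            # visible inside the child process
    t0.join()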