Counting the command line arguments and removing the not needed one in python

Time:11-29

I want to write Python code that is run with the following command:

python3 myProgram.py 4 A B C D stemfile

Here 4 is the number of files and A, B, C, D are the 4 files. I want to generate all combinations of A, B, C, D except the empty one (A, B, C, D, AB, AC, AD, BC, BD, CD, ABC, ABD, ACD, BCD, ABCD). Before that, the program reads stemfile.names. Only if stemfile.names contains the line | Final Pseudo Deletion Count is 0. does it generate the 15 combinations above. Otherwise it reports noisy data, does not consider D, and prints only the combinations of the remaining 3 files. The output is then: (A, B, C, AB, AC, BC, ABC)

In my code I always assumed that D is the last file argument and ran the loop one time fewer. But D is not always the last argument. The command can also be: python3 myProgram.py 4 B D C A stemfile

In this case my code leaves A out of the combinations. But whenever that line is not found in stemfile.names, I want to remove the D file specifically. How should I do that?

Later in the code, when the combination is A alone, it stores A in a separate output file; when it is AB, it stores the union of the A and B files in a separate file, and so on for all combinations. Here too, if the data is noisy, the D file must not appear in any of the output files.

One more example. If I run: python3 myProgram.py 3 A D B stemfile

and stemfile.names doesn't have the line | Final Pseudo Deletion Count is 0., then the output combinations are A, B, AB and it will create 2 output files only.

Below I am attaching my code:

import sys
import itertools
from itertools import combinations


def union(files):
    lines = set()
    for file in files:
        with open(file) as fin:
            lines.update(fin.readlines())
    return lines


def main():
    number = int(sys.argv[1])
    dataset = sys.argv[number + 2]

    with open(dataset + '.names') as myfile:
        if '| Final Pseudo Deletion Count is 0.' in myfile.read():
            a_list = sys.argv[2:number + 2]
            print("All possible combinations:\n")
            for L in range(1, len(a_list) + 1):
                for subset in itertools.combinations(a_list, L):
                    print(*list(subset), sep=',')
            print("...............................")
            matrix = [itertools.combinations(a_list, r)
                      for r in range(1, len(a_list) + 1)]
            combinations = [c for combinations in matrix for c in combinations]
            for combination in combinations:
                filenames = [f'{name}' for name in combination]
                output = f'{"".join(combination)}_output'
                print(f'Writing union of {filenames} to {output}')
                with open(output, 'w') as fout:
                    fout.writelines(union(filenames))

        else:
            a_list = sys.argv[2:number + 1]
            # Here I am reducing a number only

            print("Noisy data.\n")
            print("So all possible combinations:\n")

            for L in range(1, len(a_list) + 1):
                for subset in itertools.combinations(a_list, L):
                    print(*list(subset), sep=',')
            print("................................")
            matrix = [itertools.combinations(a_list, r)
                      for r in range(1, len(a_list) + 1)]
            combinations = [c for combinations in matrix for c in combinations]
            for combination in combinations:
                filenames = [f'{name}' for name in combination]
                output = f'{"".join(combination)}_output'
                print(f'Writing union of {filenames} to {output}')
                with open(output, 'w') as fout:
                    fout.writelines(union(filenames))


if __name__ == '__main__':
    main()

Please help me out.

CodePudding user response:

I think you should probably break this down into smaller, more specific questions. It seems like there is a lot of detail here that's not focused on the specific problem you're facing. I took a shot at what I think you're asking, however.

I think you're trying to figure out how to remove an item from the command line arguments. If that's the case, there's nothing you can do about what's passed to the program, but you can modify the list of inputs after you parse. I really think you should try reading about the argparse library, as I stated in my comment. I'm not sure if it's exactly what you're looking for, but here's some code using argparse that expects full filenames for each input file. The last argument must be the stemfile.

Once the arguments are parsed, you have a list of pathlib.Path objects. You can simply remove the D file from that list, wherever it appears.
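As a small illustration of that parsing and filtering step on its own, here is a minimal sketch. The file names (`B.txt`, `stemfile.names`, and so on) and the helper names `parse` and `drop_by_stem` are made up for the example; the argument list is passed in explicitly instead of being read from the real command line.

```python
import argparse
import pathlib


def parse(argv):
    """Parse one or more input files followed by a single stemfile."""
    parser = argparse.ArgumentParser()
    parser.add_argument('filenames', type=pathlib.Path, nargs='+')
    parser.add_argument('stemfile', type=pathlib.Path)
    return parser.parse_args(argv)


def drop_by_stem(paths, stem):
    """Return the paths whose stem differs from `stem`, whatever their position."""
    return [p for p in paths if p.stem != stem]


# Hypothetical arguments: D is in the middle, the stemfile comes last.
args = parse(['B.txt', 'D.txt', 'C.txt', 'A.txt', 'stemfile.names'])
kept = drop_by_stem(args.filenames, 'D')
print([p.name for p in kept])   # ['B.txt', 'C.txt', 'A.txt']
print(args.stemfile.name)       # stemfile.names
```

Because the filter compares names rather than positions, it does not matter where D sits among the arguments.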

import argparse
import itertools
import pathlib

NOISY_DATA_LINE = '| Final Pseudo Deletion Count is 0.'

def get_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('filenames', type=pathlib.Path, nargs='+')
    parser.add_argument('stemfile', type=pathlib.Path)
    return parser

def union(files):
    lines = set()
    for file in files:
        with open(file) as fin:
            lines.update(fin.readlines())
    return lines

def main():
    parser = get_parser()
    args = parser.parse_args()

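    stemfile_lines = args.stemfile.read_text().splitlines()
    # The marker line is present only when the data is clean. If it is
    # missing, the data is noisy and the D file is dropped.
    if NOISY_DATA_LINE in stemfile_lines:
        filenames = args.filenames
    else:
        filenames = [p for p in args.filenames if p.stem != 'D']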

    matrix = [itertools.combinations(filenames, r) for r in range(1, len(filenames) + 1)]
    combinations = [c for combinations in matrix for c in combinations]
    print(' '.join([str([p.stem for p in c]) for c in combinations]))
    for combination in combinations:
        output = f'{"".join([p.stem for p in combination])}_output.txt'
        print(f'Writing union of {[p.stem for p in combination]} to {output}')
        with open(output, 'w') as fout:
            # Write the union of this combination only, not of every input file.
            fout.writelines(union(combination))

if __name__ == '__main__':
    main()
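If you would rather keep the original command form (python3 myProgram.py 4 B D C A stemfile) and plain sys.argv, the same idea can be sketched without argparse. This is only a sketch under some assumptions: the file to drop is always named exactly D, and the helper names `select_files`, `all_combinations`, and `is_clean` are invented for the example. The argument list and the contents of stemfile.names are passed in as values so the snippet runs without any files on disk.

```python
import itertools

MARKER = '| Final Pseudo Deletion Count is 0.'
NOISY_FILE = 'D'  # assumption: the file to drop is always named exactly 'D'


def is_clean(names_text):
    """True when the text of stemfile.names contains the marker line."""
    return MARKER in names_text


def select_files(argv, clean):
    """Pick the input files out of argv; drop NOISY_FILE when the data is noisy."""
    number = int(argv[1])
    files = argv[2:number + 2]
    if clean:
        return files
    # Filter by name, not by position, so D may appear anywhere.
    return [name for name in files if name != NOISY_FILE]


def all_combinations(files):
    """Every non-empty combination of files, shortest first."""
    return [combo
            for r in range(1, len(files) + 1)
            for combo in itertools.combinations(files, r)]


# Same shape as sys.argv for: python3 myProgram.py 3 A D B stemfile
argv = ['myProgram.py', '3', 'A', 'D', 'B', 'stemfile']
files = select_files(argv, clean=is_clean('some other line\n'))
print([''.join(c) for c in all_combinations(files)])  # ['A', 'B', 'AB']
```

In a real script you would call `select_files(sys.argv, ...)` and pass the text read from the .names file to `is_clean`.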