How do I select a file in my directory whose suffix is the name of another file?

I am trying to use Python to automate a program that runs in the terminal on Ubuntu.

I have a lot of files in my directory, and each file has a sister file associated with it, all in the same directory. The names of the files start with 1 and go all the way up to 500.

For example, there will be files like 1.mp3, 1_hello.mp4, 1_something.mp4, 1_something_else.mp3 and 2.mp3, 2_hello.mp4, 2_something.mp4, 2_something_else.mp3. I am only concerned with 1.mp3 and 1_hello.mp4, and similarly for all the other numbers in my directory. Other files with the same numeric prefix don't matter to me. How can I automate this? I have tried this, but it doesn't work.

import os
directory_name = '/home/user/folder/'
for file_name in os.listdir(directory_name):
    if file_name.endswith(".mp3"):
        for sis_file in os.listdir(directory_name):
            if sis_file.endswith("file_name._hello.mp4"):
                os.system("command file_name -a file_name.sis_file file_name.new_file")

The command is written in the os.system line. new_file is created as the result of this operation, and it too must include the name of the original file so that it is easily identifiable. Also, for the command in os.system, each file must be paired only with its sister file, or I will get inconsistent results.

Edit: I will have a lot of files in my directory. All the files are sequentially numbered, beginning from 1. Each file has many other files associated with it, and all these sister files will have the same prefix as the parent file. The command that I have to automate looks like this:

command name_of_the_parent_file.mp3 -a name_of_the_sister_file.txt name_of_the_output_file.mp4

name_of_the_parent_file would be like 1.mp3, 2.mp3, 3.mp3.
name_of_the_sister_file would be like 1_hello.txt, 2_hello.txt.
name_of_the_output_file would be the name of the new file that this command creates.

CodePudding user response：

This is what I gathered from your question.

from glob import glob
import re

path = glob("/home/testing/*") # your file path here
# first run of digits in each path, i.e. the file's number
# (this assumes the directory part of the path contains no digits)
individual_files = [list(map(int, re.findall(r'\d+', x)))[0] for x in path]
set_files = list(set(individual_files)) # unique file numbers
for i in set_files:
    # all paths that share the number i
    files = [path[idx] for idx, x in enumerate(individual_files) if x == i]
    print(*files, sep = "\n")
    print("\n\n")

The output groups the files in the folder by their number:

/home/testing/1.csv
/home/testing/1_something_else.csv
/home/testing/1_something.csv



/home/testing/2_something.csv
/home/testing/2_something_else.csv
/home/testing/2.csv



/home/testing/11_something.csv
/home/testing/11_something_else.csv
/home/testing/11.csv



/home/testing/111.csv
/home/testing/111_something.csv
/home/testing/111_something_else.csv
/home/testing/111_testing.txt
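
The same grouping can also be done in a single pass with a dictionary. This is only a minimal sketch: group_by_number is a made-up helper name, and it assumes every file of interest starts its name with a number. It matches the number at the start of the base name, so digits in the directory path do not interfere.

```python
import os
import re
from collections import defaultdict

def group_by_number(paths):
    """Group paths by the number at the start of each file name."""
    groups = defaultdict(list)
    for p in paths:
        # look only at the file name, not the directory part
        m = re.match(r'\d+', os.path.basename(p))
        if m:  # skip names that do not start with a number
            groups[int(m.group())].append(p)
    return dict(groups)

# example paths in the style of the listing above
example = [
    "/home/testing/1.csv",
    "/home/testing/1_something.csv",
    "/home/testing/11.csv",
    "/home/testing/notes.txt",
]
groups = group_by_number(example)
for number in sorted(groups):
    print(number, groups[number])
```

Because the key is converted to an integer, 1 and 11 end up in separate groups rather than being matched as a shared text prefix.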

CodePudding user response：

If I get you right, you need to run this command once for each parent file paired with each of its child files:

import os
from glob import glob

files = glob('/home/user/folder/*.*') #get a list of files in the folder
# get parent files: the base name (without directory and extension) must be all digits
parent_files = [i for i in files if os.path.splitext(os.path.basename(i))[0].isdigit()]

for file in parent_files:
    prefix = os.path.splitext(file)[0] + "_" # e.g. /home/user/folder/1_
    for j in [i for i in files if i.startswith(prefix)]: #run command on those files that start with the same number as the parent file
        os.system(f"command {file} -a {j} {os.path.splitext(j)[0]}.extension") # I have no idea what the output file should look like, so I named it after the child file with a random extension
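
If only one specific sister file per parent should be used (for example 1.mp3 with 1_hello.mp4 and nothing else), the pairing can be made explicit. The following is a rough sketch under a few assumptions: build_commands is a hypothetical helper, "command" is the placeholder program name from the question, and the "_hello.mp4" and "_output.mp4" suffixes are guesses that should be replaced with the real ones.

```python
from pathlib import Path

def build_commands(directory, sister_suffix="_hello.mp4", output_suffix="_output.mp4"):
    """Return one argument list per parent file that has a matching sister file."""
    directory = Path(directory)
    commands = []
    # parent files are named with digits only, e.g. 1.mp3, 2.mp3
    parents = [p for p in directory.glob("*.mp3") if p.stem.isdigit()]
    for parent in sorted(parents, key=lambda p: int(p.stem)):
        sister = directory / f"{parent.stem}{sister_suffix}"
        if not sister.is_file():
            continue  # no sister file for this number, so skip it
        # the output name reuses the parent's number so it stays identifiable
        output = directory / f"{parent.stem}{output_suffix}"
        commands.append(["command", str(parent), "-a", str(sister), str(output)])
    return commands
```

Each returned list can then be passed to subprocess.run(cmd, check=True). Passing the arguments as a list rather than one formatted string avoids problems with spaces or shell characters in file names.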