Home > database >  Copy and rename pictures based on xml nodes
Copy and rename pictures based on xml nodes

Time:01-29

I'm trying to copy all pictures from one directory (also including subdirectories) to another target directory. Whenever the exact picture name is found in one of the xml files the tool should grap all information (attributes in the parent and child nodes) and create subdirectories based on those node informations, also it should rename the picture file. The part when it extracts all the information from the nodes is already done.

from bs4 import BeautifulSoup as bs

path_xml = r"path\file.xml"

content = []

with open(res, "r") as file:
    content = file.readlines()

content = "".join(content)

def get_filename(_content):
    bs_content = bs(_content, "html.parser")

    # some code
    
    picture_path = f'{pm_1}{pm_2}\{pm_3}\{pm_4}\{pm_5}_{pm_6}_{pm_7}\{pm_8}\{pm_9}.jpg'

get_filename(content)

So in the end I get a string value with the directory path and the file name I want.

Now I struggle with opening all xml files in one directory instead of just opening one file. I tryed this:

import os
dir_xml = r"path"

res = []

for path in os.listdir(dir_xml):
    if os.path.isfile(os.path.join(dir_xml, path)):
        res.append(path)

with open(res, "r") as file:
    content = file.readlines()

but it gives me this error: TypeError: expected str, bytes or os.PathLike object, not list

How can i read through all xml files instead of just one? I have hundreds of xml files so that will take a wile :D

And another question: How can i create directories base on string?

Lets say the value of picture_path is AB\C\D\E_F_G\H\I.jpg

I would need another directory path for the destination of the created folders and a function that somehow creates folders based on that string. How can I do that?

CodePudding user response:

To read all XML files in a directory, you can modify your code as follows:

import os
dir_xml = r"path"

for path in os.listdir(dir_xml):
    if path.endswith(".xml"):
        with open(os.path.join(dir_xml, path), "r") as file:
            content = file.readlines()
        content = "".join(content)
        get_filename(content)

This code uses the os.listdir() function to get a list of all files in the directory specified by dir_xml. It then uses a for loop to iterate over the list of files, checking if each file ends with the .xml extension. If it does, it opens the file, reads its content, and passes it to the get_filename function.

To create directories based on a string, you can use the os.makedirs function. For example:

import os

picture_path = r'AB\C\D\E_F_G\H\I.jpg'
 dest_path = r'path_to_destination'

os.makedirs(os.path.join(dest_path, os.path.dirname(picture_path)), exist_ok=True)

In this code, os.path.join is used to combine the dest_path and the directory portion of picture_path into a full path. os.path.dirname is used to extract the directory portion of picture_path. The os.makedirs function is then used to create the directories specified by the path, and the exist_ok argument is set to True to allow the function to succeed even if the directories already exist.

Finally, you can use the shutil library to copy the picture file to the destination and rename it, like this:

import shutil

src_file = os.path.join(src_path,  picture_path)
dst_file = os.path.join(dest_path, picture_path)

shutil.copy(src_file, dst_file)

Here, src_file is the full path to the source picture file and dst_file is the full path to the destination. The shutil.copy function is then used to copy the file from the source to the destination.

CodePudding user response:

You can use os.walk() for recursive search of files:

import os

dir_xml = r"path"

for root, dirs, files in os.walk(dir_xml):  #topdown=False
    for names in files:
        if ".xml" in names:
            print(f"file path: {root}\n XML-Files: {names}")
            with open(names, 'r') as file:
                content = file.readlines()
  • Related