Summary of my problem: I am trying to build an Excel sheet in which the first column lists the files found in a set of sub folders, and a second column gives, for each file, the name of the sub folder that contains it.
In detail: I have 22 sub folders in a specific directory, and each one contains 9 to 12 images. For example, the first few folders contain 12, 12, 12, 9, and 11 images respectively. I have already written the image names from all 22 sub folders into one column. Now I want to add another column that holds the folder name matching each image name, like this:
Image Name    Folder Name
one1.jpg      Folder1
one2.jpg      Folder1
one3.jpg      Folder1
one4.jpg      Folder1
one5.jpg      Folder1
one6.jpg      Folder1
one7.jpg      Folder1
one8.jpg      Folder1
one9.jpg      Folder1
one10.jpg     Folder1
one11.jpg     Folder1
one12.jpg     Folder1
two1.jpg      Folder2
two2.jpg      Folder2
two3.jpg      Folder2
two4.jpg      Folder2
two5.jpg      Folder2
two6.jpg      Folder2
two7.jpg      Folder2
two8.jpg      Folder2
two9.jpg      Folder2
two10.jpg     Folder2
two11.jpg     Folder2
two12.jpg     Folder2
Tree2.jpg     Folder3
.......
I got close, but my code does not fill the second column correctly:
import os

# os.chdir() returns None, so keep the path in its own variable.
path1 = '/content/drive/MyDrive/Images'
os.chdir(path1)
subDirectory = list(os.walk(path1))
sd = subDirectory[0][1]
lengthOFsd = len(sd)
lengthOFsd = (lengthOFsd + 1)
# import xlsxwriter module
# !pip install xlsxwriter
import xlsxwriter
path2 = os.chdir('/content/drive/MyDrive/Research/excle')
workbook = xlsxwriter.Workbook('data_path_reader.xlsx')
# By default worksheet names in the spreadsheet will be
# Sheet1, Sheet2 etc., but we can also specify a name.
worksheet = workbook.add_worksheet("First _Sheet")
# Use the worksheet object to write
# data via the write() method.
worksheet.write('A1', 'Image Name')
worksheet.write('B1', 'Folder Name')
current_directory = os.listdir(path1)
subDirectory = list(os.walk(path1))
row = 1
row2 =1
col = 0
col2 = 0
# For 1st column images name
for j in range(1,lengthOFsd):
    subd1 = list(subDirectory[j][2])
    for imgname in subd1:
        worksheet.write(row, col, imgname)
        row += 1
# For 2nd column Folders name
lenoffoldImg = len(subDirectory[1][2])
for flodername in range(0,lenoffoldImg):
    worksheet.write_column(row2, col2 + 1, sd)
    row2 += 1
workbook.close()
CodePudding user response:
Here I have used the library openpyxl, since I am not used to working with xlsxwriter. Nonetheless, the logic is the same, and I am positive you can adapt it to your needs.
Here is the solution:
from os import walk, getcwd
from os.path import basename
import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet.cell(row=1, column=1).value = "Image Name"
sheet.cell(row=1, column=2).value = "Folder Name"
# Skips the files in the root directory. To include them, remove this variable and its logic.
skip_root_files = True
# The row in which the next file and folder name will be written.
row_number = 2
for foldername, _, filenames in walk(getcwd()):
    # Skip files from root directory logic:
    if skip_root_files:
        skip_root_files = False
        continue
    for filename in filenames:
        sheet.cell(row=row_number, column=1).value = filename
        sheet.cell(row=row_number, column=2).value = basename(foldername)
        row_number += 1
wb.save("file.xlsx")
As explained in the comments, if you don't want to crawl the root directory, add some logic to skip the first loop cycle (i.e. the root). I've done it with a boolean flag.
The variable row_number starts at 2 because the first row is reserved for the titles Image Name and Folder Name.
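As a variation on the boolean flag, the root can also be skipped by comparing each walked folder against the starting directory. The sketch below is a minimal, standard-library-only illustration of that idea. `list_images_by_folder` is a hypothetical helper name (it is not part of openpyxl or of the code above), and the sorting is an added assumption to make the order predictable.

```python
import os


def list_images_by_folder(root):
    """Return (filename, folder name) pairs for files in sub folders of root.

    Hypothetical helper for illustration; files directly in root are skipped.
    """
    pairs = []
    for foldername, dirnames, filenames in os.walk(root):
        # Sorting in place makes os.walk visit the sub folders in name order.
        dirnames.sort()
        if os.path.samefile(foldername, root):
            # This is the root itself, so skip it without needing a flag.
            continue
        # Note: sorted() is lexicographic, so "one10.jpg" comes before "one2.jpg".
        for filename in sorted(filenames):
            pairs.append((filename, os.path.basename(foldername)))
    return pairs
```

Passing the result through `enumerate(pairs, start=2)` would give row numbers that leave row 1 free for the titles.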
Hope it helps!
CodePudding user response:
I found the solution. It takes only a small change to the loop, shown below:
Solution
# Assumes row1/col1 are initialised like row/col in the question (row1 = 1, col1 = 0).
for j in range(0,lengthOFsd):
    subd1 = list(subDirectory[j][2])
    foldername = os.path.basename(subDirectory[j][0])
    for imgname in subd1:
        worksheet.write(row1, col1, imgname)
        worksheet.write(row2, col2 + 1, foldername)
        row1 += 1
        row2 += 1
Note: apply this change to the first loop and remove the second loop entirely.
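Putting the change together, the whole export can be sketched with a single loop and one row counter shared by both columns. This is a minimal sketch under stated assumptions rather than the exact code from the question. `write_image_table` and `export_to_xlsx` are hypothetical names, the sorting is an added assumption, and `worksheet` only needs a `write(row, col, value)` method, which xlsxwriter worksheets provide.

```python
import os


def write_image_table(worksheet, root):
    """Write one row per image: column A = file name, column B = its folder.

    Returns the number of data rows written. Like the loop above, the root
    folder is visited too, so files directly in root would be listed under
    the root's own name.
    """
    worksheet.write(0, 0, "Image Name")
    worksheet.write(0, 1, "Folder Name")
    row = 1  # row 0 holds the titles
    for folder_path, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit sub folders in name order
        foldername = os.path.basename(folder_path)
        for imgname in sorted(filenames):
            # Both cells share the same row, so names and folders stay aligned.
            worksheet.write(row, 0, imgname)
            worksheet.write(row, 1, foldername)
            row += 1
    return row - 1


def export_to_xlsx(root, out_path):
    """Hypothetical wrapper; requires the third-party xlsxwriter package."""
    import xlsxwriter  # imported lazily so the helper above works without it

    workbook = xlsxwriter.Workbook(out_path)
    worksheet = workbook.add_worksheet("First_Sheet")
    write_image_table(worksheet, root)
    workbook.close()
```

Using one counter for both columns removes the need to keep `row1` and `row2` in step by hand.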