How can I iterate through Data Frame and create folders as columns in the df and subfolders as rows

Time:10-07

This is the code I got:

prod_lst = getprod()
prod_df = pd.DataFrame(prod_lst, columns = ['Client'])
for i, j in prod_df.iterrows():
  print(i,j)
  path = '../models/' + i + '/' + j
  isExist = os.path.exists(path)
  if not isExist:
      os.makedirs(path)

'''
  Client
0 OTT
1 DVD
2 OTV
'''

I need a directory structure like this:

'''
Client
    OTT
    DVD
    OTV
'''

CodePudding user response:

DataFrame.iterrows produces "(index, Series) pairs". In the question's loop, i is therefore the row index (an integer here) and j is a Series holding the values of that row, so neither can be concatenated to a string directly. To get the Client value for a row, we need to select it from the Series:

import os
import pandas as pd

# Sample DF
prod_df = pd.DataFrame([['OTT'], ['DVD'], ['OTV']], columns=['Client'])
# Column to Select
col = 'Client'
# Iterate over rows
for idx, row in prod_df.iterrows():
    path = f'../models/{col}/{row[col]}'
    exists = os.path.exists(path)
    if not exists:
        os.makedirs(path)

Alternatively, if we do not need all the values from the row, we can iterate over just the Client column instead:

import os
import pandas as pd

# Sample DF
prod_df = pd.DataFrame([['OTT'], ['DVD'], ['OTV']], columns=['Client'])
# Column to Select
col = 'Client'
# Loop over the values in just that column
for value in prod_df[col]:
    path = f'../models/{col}/{value}'
    exists = os.path.exists(path)
    if not exists:
        os.makedirs(path)

The folder creation logic could also be simplified with Path.mkdir from pathlib:

from pathlib import Path

import pandas as pd

# Sample DF
prod_df = pd.DataFrame([['OTT'], ['DVD'], ['OTV']], columns=['Client'])
# Column to Select
col = 'Client'
# Create each folder, along with any missing parent folders
for value in prod_df[col]:
    Path(f'../models/{col}/{value}').mkdir(parents=True, exist_ok=True)
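
If the DataFrame has more than one column, the same idea can be extended so that every column becomes a folder and every value in it becomes a subfolder. The following is only a sketch under some assumptions: the helper name make_dirs and the extra 'Region' column are made up for illustration, a temporary directory stands in for '../models', and duplicate or missing values are skipped.

```python
from pathlib import Path
import tempfile

import pandas as pd


def make_dirs(df, base):
    """Create base/<column>/<value> for each unique, non-null value in each column."""
    created = []
    for col in df.columns:
        # dropna() skips missing values; unique() avoids repeating the same folder
        for value in df[col].dropna().unique():
            target = Path(base) / str(col) / str(value)
            target.mkdir(parents=True, exist_ok=True)
            created.append(target)
    return created


# Sample DF; the 'Region' column is hypothetical
prod_df = pd.DataFrame({'Client': ['OTT', 'DVD', 'OTV'],
                        'Region': ['EU', 'US', 'EU']})
# A temporary directory is used here instead of '../models'
base_dir = Path(tempfile.mkdtemp()) / 'models'
created_dirs = make_dirs(prod_df, base_dir)
```

Values are passed through str() so that numeric columns also produce usable folder names.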

CodePudding user response:

I have found a method that works, though I am not sure it is best practice.

# Each tuple is (index, Client) because the DataFrame has a single column
for idx, client in prod_df.itertuples():
    path = '../models/' + prod_df.columns[0] + '/' + client
    isExist = os.path.exists(path)
    if not isExist:
        os.makedirs(path)
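
One possible tidy-up of the itertuples approach, offered as a sketch rather than as the best practice: passing index=False and name=None makes itertuples yield plain tuples without the index, and exist_ok=True on os.makedirs removes the need for the separate existence check. The temporary base directory below is an assumption, used in place of '../models'.

```python
import os
import tempfile

import pandas as pd

# Sample DF matching the one in the question
prod_df = pd.DataFrame([['OTT'], ['DVD'], ['OTV']], columns=['Client'])
# A temporary directory is used here instead of '../models'
base_dir = os.path.join(tempfile.mkdtemp(), 'models')
col = prod_df.columns[0]

# With index=False and name=None each row is a plain one-element tuple
for (client,) in prod_df.itertuples(index=False, name=None):
    # exist_ok=True makes the call a no-op if the folder already exists
    os.makedirs(os.path.join(base_dir, col, client), exist_ok=True)
```

The one-element unpacking only works because the DataFrame has a single column; with more columns the tuple would need one name per column.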