load csv and add column as a for loop

Time:04-03

I have a very simple process I need to repeat multiple times and across multiple sets of files, but I can't figure out how to turn it into a for loop. Essentially, I am just loading a CSV and adding a column called "sample_year" each time, then filling the whole column with the year (which appears in the same place in each file name).

Please see the process below, which I would like to turn into a for loop.

import pandas as pd

pff_grade2021 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2021.csv")
pff_grade2021['sample_year'] = 2021  # a scalar fills every row; [2021] only fits a one-row frame
pff_grade2020 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2020.csv")
pff_grade2020['sample_year'] = 2020
pff_grade2019 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2019.csv")
pff_grade2019['sample_year'] = 2019
pff_grade2018 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2018.csv")
pff_grade2018['sample_year'] = 2018
pff_grade2017 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2017.csv")
pff_grade2017['sample_year'] = 2017
pff_grade2016 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2016.csv")
pff_grade2016['sample_year'] = 2016
pff_grade2015 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2015.csv")
pff_grade2015['sample_year'] = 2015
pff_grade2014 = pd.read_csv(r"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade 2014.csv")
pff_grade2014['sample_year'] = 2014

CodePudding user response:

Loop over the years and interpolate each one into the path, storing every DataFrame in a dictionary keyed by year.

grades = {}
for year in range(2014, 2022):
    grades[year] = pd.read_csv(fr"C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade\Pass Grade {year}.csv")
    grades[year]['sample_year'] = year  # a scalar is broadcast to every row
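
Since the question mentions repeating this across multiple sets of files, the loop above could be wrapped in a function. This is only a sketch: `load_yearly_csvs`, `combine`, and their parameters are hypothetical names chosen for illustration, and stacking the results with `pd.concat` is an optional extra step that the question does not ask for.

```python
from pathlib import Path

import pandas as pd


def load_yearly_csvs(folder, prefix, years):
    """Read '<prefix> <year>.csv' for each year and tag its rows with sample_year.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    frames = {}
    for year in years:
        df = pd.read_csv(Path(folder) / f"{prefix} {year}.csv")
        df["sample_year"] = year  # a scalar is broadcast to every row
        frames[year] = df
    return frames


def combine(frames):
    """Stack the per-year DataFrames into one, renumbering the rows."""
    return pd.concat(list(frames.values()), ignore_index=True)
```

With the folder from the question stored in a variable such as `path`, a call like `grades = load_yearly_csvs(path, "Pass Grade", range(2014, 2022))` would give the same dictionary as the loop, and the same function could then be pointed at other folders and file prefixes.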

CodePudding user response:

You can list the files in the folder and then loop over them as follows:

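import os
import pandas as pd

path = r'C:\Users\yanke\OneDrive - Stetson University, Inc\Personal\final_projects\nfl_prospect_model\prospect_data\qb\pff\passing_grade'
files = os.listdir(path)
grades = {}  # keep each DataFrame instead of overwriting it on every pass
for f in files:
    pff_grade = pd.read_csv(os.path.join(path, f))
    pff_grade['sample_year'] = int(f[-8:-4])  # 'Pass Grade 2021.csv' -> 2021
    grades[f] = pff_grade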
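
The slice `f[-8:-4]` only works while the year is the last four characters before a four-character extension, and `os.listdir` also returns any non-CSV files that happen to be in the folder. A slightly more defensive sketch pulls the year out with a regular expression; `year_from_filename` and `YEAR_PATTERN` are hypothetical names, and the pattern assumes the years fall between 1900 and 2099.

```python
import re

# Four-digit year (19xx or 20xx) immediately before the ".csv" extension.
YEAR_PATTERN = re.compile(r"((?:19|20)\d{2})\.csv$", re.IGNORECASE)


def year_from_filename(filename):
    """Return the year embedded in a name like 'Pass Grade 2021.csv', or None."""
    match = YEAR_PATTERN.search(filename)
    return int(match.group(1)) if match else None


print(year_from_filename("Pass Grade 2021.csv"))  # 2021
print(year_from_filename("notes.txt"))            # None
```

Inside the loop, one could call `year = year_from_filename(f)` and `continue` when it returns `None`, so that stray files are skipped rather than read as CSVs.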
    